The definition and the properties of a Gaussian point distribution, in contrast to the well-known
properties of a Gaussian random field, are discussed. Constraints for the number density and the
two-point correlation function arise. A simple method for the simulation of this so-called
Gauss-Poisson point process is given and illustrated with an example. The comparison of the
distribution of galaxies in the PSCz catalogue with the Gauss-Poisson process underlines the
importance of higher-order correlation functions for the description of the galaxy distribution.
The construction of the Gauss-Poisson point process is extended to the n-point Poisson cluster
process, now incorporating correlation functions up to the nth order. The simulation methods and
constraints on the correlation functions are discussed for the n-point case and detailed for the
three-point case. As another approach, well suited for strongly clustered systems, the generalized
halo model is discussed. The influence of substructure inside the halos on the two- and three-point
correlation functions is calculated in this model.
I. INTRODUCTION

Stochastic models are often used to describe physical 
phenomena. For spatial structures two broad classes of 
stochastic models have been established. One approach is 
based on random fields and the other one on random dis- 
tributions of discrete objects, often only points, in space 
(see the contributions in for recent applications and 
reviews). Stochastic point distributions are used to de- 
scribe physical systems on vastly differing length-scales. 
The physical applications discussed in this article deal 
with the large-scale structures in the Universe formed 
by the distribution of galaxies. However, the methods 
are much more versatile. 

Models for the dynamics of cosmic structures are of- 
ten based on nonlinear partial differential equations for 
the mass density and velocity field. These models re- 
late the initial mass density and velocity field, primarily 
modeled as Gaussian random fields, to the present day 
values of these fields. The nonlinear evolution leads to 
non-Gaussian features in the fields. However, observa- 
tions supply us with the positions of galaxies in space. 
To compare theories with observations one has to relate fields with point distributions. Both deterministic and stochastic models have been used for this purpose so far (e.g. |,|).

Pursuing a direct approach, the observed spatial dis- 
tribution of galaxies (galaxy clusters etc.) is compared 
to models for random point sets. Only a few attempts 
towards a dynamics of galaxies as discrete objects have 
been made (see e.g. [||), however stochastic models are 
quite common. Following the works by |^-||, and |^ a 
purely stochastic description of the spatial distribution 
of galaxies as points in space is given in this article. 

Models for stochastic point processes can be con- 
structed using the physical interactions of the objects, 
typically leading to Gibbs processes (see e.g. [|lO|,0, and 
the generalizations by [p2|-p^). Another approach to 
construct point processes is based on purely geometrical considerations, e.g. points are randomly distributed on randomly placed line segments (see ||l^,|l6|). As a
third possibility one can start from the characterization 
of point processes by the probability generating func- 
tional (p.g.fl.) and the expansion in terms of correlation 
functions. This is the way pursued in this work. 

The simplest point process is a Poisson process show- 
ing no correlations at all. Since the galaxy distribution 
is highly clustered, a Poisson process is not a realistic 
model. The model with the next level of sophistication is 
a Gauss-Poisson process, the point distribution counter- 
part of a Gaussian random field. Whereas the properties 
of Gaussian random fields have been extensively studied, 
the Gauss-Poisson process has not been discussed in the 
cosmological literature in a systematic way. Some of the 
statistical properties of random fields directly translate 
to similar statistical properties of point distributions, but 
also important differences show up. The systematic inclusion of higher-order correlations, as well as the characterization and the simulation algorithms for such point processes, will be discussed.

Recently, a related class of stochastic models for the 
galaxy distribution, the halo-model, attracted some at- 
tention (see e.g. |17|-^0|). These models are based on the 
assumption that galaxies are distributed inside correlated 
dark matter halos. Using the probability generating func- 
tional, the two- and three-point correlation function will 
be calculated for this model, extending the results by |2^] 
to include the effects of halo-substructure. 

The outline of this paper is as follows:
In Sect. II the properties of the probability generating functional (p.g.fl.) of a point process and the expansions in several types of correlation functions are briefly reviewed. The characterization of the Gauss-Poisson process is given in Sect. III, and the physical consequences of the constraints are discussed. The close relation to Poisson cluster processes allows us to simulate a Gauss-Poisson process (see Appendix B 1). In Sect. IV simulations of the Gauss-Poisson process and the line-segment process are used to show how the Gaussian approximation influences the J(r)-function, a statistic sensitive to higher-order correlations. A comparison of the galaxy distribution within the PSCz survey with a Gauss-Poisson process illustrates the importance of higher-order correlations. In Sect. V the extension of the Gauss-Poisson point process to the n-point Poisson cluster process is discussed. Detailed results are derived for the three-point Poisson cluster process (the simulation recipe is given in Appendix B 2). The characterization of the general n-point process is discussed, which is again detailed for the three-point case. Differences between a point process and a random field are highlighted in Sect. VI. Models for strongly correlated systems are mentioned in Sect. VII. The focus will be on the "halo model". Using the formalism based on the p.g.fl., the correlation functions of the "halo model", including the effects of halo-substructure, are calculated in Sect. VII B. In Sect. VIII some open problems are mentioned. An outlook is provided in Sect. IX. As an example the probability generating function (p.g.f.) of a random variable and its expansions in several kinds of moments are reviewed in Appendix A.



II. PRODUCT DENSITIES, FACTORIAL 
CUMULANTS, AND THE PROBABILITY 
GENERATING FUNCTIONAL 

Probability generating functionals (p.g.fl. 's), and their 
expansions in different kinds of correlation measures have 
been used to describe noise in time series (e.g. [Q) and 
the electro-magnetic cascades occurring in air-showers 
(e.g. p3|). They have been employed in the theory of 
liquids (e.g. 124 ) and other branches of many-particle
physics (e.g. [^|). The mathematical theory of p.g.fl. 's 
for point processes is nicely reviewed in the book of P6| . 
Stochastic methods based on p.g.f.'s have been introduced to cosmology by § (the p.g.fl. was presented by
p7t in the discussion of this article), and became well- 
known following the work of and j|] . Focusing on the 
factorial moments (the volume averaged n-point densi- 
ties) and on count-in-cells, 1^ discussed several expan- 
sions of the p.g.f.'s. In the following only "simple" point 
processes will be considered: at each position in space at 
most one object is allowed. This assumption is physically 
well justified for galaxies. Also, for quantum systems the 
methods should be refined (see e.g. ||2^ for fermionic (de- 
terminantal) point processes). 

An intuitive way to characterize a point process is to use nth-product densities: $\varrho_n(\mathbf{x}_1,\ldots,\mathbf{x}_n)\,dV_1\cdots dV_n$ is the probability of finding a point in each of the volume elements $dV_1$ to $dV_n$. For stationary and isotropic point fields $\varrho_1(\mathbf{x}) = \varrho$ is the mean number density, and the product density (with a slight abuse of notation) is $\varrho_2(\mathbf{x}_1,\mathbf{x}_2) = \varrho_2(r)$ with $r = |\mathbf{x}_1 - \mathbf{x}_2|$ being the separation of the two points. The factorial cumulants $c_{[n]}(\mathbf{x}_1,\ldots,\mathbf{x}_n)$ are the irreducible or connected parts of the nth-product densities. E.g. for $n = 2$

$$\varrho_2(r) = \varrho^2 + c_{[2]}(r) = \varrho^2\,\bigl(1 + \xi_2(r)\bigr), \tag{1}$$

and the second factorial cumulant $c_{[2]}(r)$ and the two-point correlation function $\xi_2(r)$ quantify the two-point correlations in excess of Poisson distributed points.

A systematic characterization of a point process is provided by the probability generating functional or a series of probability generating functions (see Appendix A). Equivalent to a random distribution of points in space, one considers a point process as a random counting measure. A realization is then a counting measure $N$, which assigns to each suitable set $A \subset \mathbb{R}^d$ the number of points $N(A)$ inside. For suitable functions $h(\mathbf{x})$ one defines the probability generating functional of a point process via

$$G[h] = E\left[\exp\left(\int_{\mathbb{R}^d} N(d\mathbf{x})\,\log h(\mathbf{x})\right)\right], \tag{2}$$

where $\mathbb{R}^d$ is the d-dimensional Euclidean space, and $E$ is the expectation value, the ensemble average over realizations of the point process. Equivalently,

$$G[h] = E\Bigl[\prod_i h(\mathbf{x}_i)\Bigr], \tag{3}$$



where $\mathbf{x}_i$ are the particle positions in a realization. Consider $k$ compact disjoint sets $A_j$, and let $n_j = N(A_j)$ be the number of points inside $A_j$. The p.g.f. of the $k$-dimensional random vector $(n_1,\ldots,n_k)$ is then

$$P_k(\mathbf{z}) = P_k(z_1,\ldots,z_k) = E\Bigl[\prod_{j=1}^{k} z_j^{\,n_j}\Bigr]. \tag{4}$$



Together with a continuity requirement the knowledge of all finite dimensional p.g.f.'s $P_k$ determines the p.g.fl. $G$ and the point process completely (e.g. (2^). One obtains the p.g.f. $P_k(\mathbf{z}) = G[h']$ of the random vector $(n_1,\ldots,n_k)$ using

$$h'(\mathbf{x}) = 1 - \sum_{j=1}^{k} (1 - z_j)\,\mathbf{1}_{A_j}(\mathbf{x}), \tag{5}$$



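where $\mathbf{1}_A(\mathbf{x})$ is the indicator function of the set $A$, with $\mathbf{1}_A(\mathbf{x}) = 1$ for $\mathbf{x} \in A$ and zero otherwise. Several expansions of the p.g.fl. $G[h]$ are possible |Q. The expansion in terms of the product densities $\varrho_n$ (the Lebesgue densities of the factorial moment measures) reads

$$G[h+1] = 1 + \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; \varrho_n(\mathbf{x}_1,\ldots,\mathbf{x}_n)\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n). \tag{6}$$

For the factorial cumulants $c_{[n]}$ or correlation functions $\xi_n$ one obtains |^

$$\log G[h+1] = \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; c_{[n]}(\mathbf{x}_1,\ldots,\mathbf{x}_n)\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n) = \sum_{n=1}^{\infty} \frac{\varrho^n}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; \xi_n(\mathbf{x}_1,\ldots,\mathbf{x}_n)\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n), \tag{7}$$

with $\xi_1 = 1$. As a third possibility the p.g.fl. can be expanded around the origin:

$$G[h] = J_0 + \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n)\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n). \tag{8}$$

The Janossy densities $j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n)\,dV_1\cdots dV_n$ are the probability that there are exactly $n$ points, each in one of the volume elements $dV_1$ to $dV_n$. Convergence issues of these expansions are discussed in Sect. VIII.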



III. THE GAUSS-POISSON POINT PROCESS 

For a stationary Poisson process with mean number density $\varrho$ the p.g.fl. is

$$\log G[h+1] = \varrho \int_{\mathbb{R}^d} d\mathbf{x}\; h(\mathbf{x}), \tag{9}$$

corresponding to a truncation of the expansion after the first term. Truncating after the second term, one obtains the p.g.fl. for the Gauss-Poisson process [p9|,p0[

$$\log G[h+1] = \varrho \int_{\mathbb{R}^d} d\mathbf{x}\; h(\mathbf{x}) + \frac{\varrho^2}{2} \int_{\mathbb{R}^d} d\mathbf{x} \int_{\mathbb{R}^d} d\mathbf{y}\; \xi_2(|\mathbf{x}-\mathbf{y}|)\, h(\mathbf{x})\,h(\mathbf{y}), \tag{10}$$

completely specified by its mean number density $\varrho$ and the two-point correlation function $\xi_2(r)$.

$^1$ The relations to the generating functionals $\mathcal{R}$, $\mathcal{F}$ and $\mathcal{Q}$ defined by [g are $G[h] = \mathcal{R}[h]$, $G[h] = \mathcal{F}[h-1]$ and $G[h] = \exp \mathcal{Q}[h-1]$.

There is a close resemblance to random fields. For a homogeneous random field $\rho(\mathbf{x})$ with mean $E[\rho] = \bar\rho$ the density contrast is defined by $\delta(\mathbf{x}) = (\rho(\mathbf{x}) - \bar\rho)/\bar\rho$. A homogeneous and isotropic Gaussian random field is stochastically fully specified by its mean $\bar\rho$ and its correlation function $\xi_2^{\delta}(|\mathbf{x}-\mathbf{y}|) = E[\delta(\mathbf{x})\,\delta(\mathbf{y})]$. Here, $E$ is the average over realizations of the random field. The higher (connected) correlation functions $\xi_n^{\delta}$ with $n > 2$ vanish. Similarly, the correlation functions $\xi_n$ for $n > 2$ vanish in a Gauss-Poisson process. However, important differences between a Gaussian random field and a Gauss-Poisson point process also show up.



A. Constraints on $\xi_2(r)$ and $\varrho$

A functional $G[h]$ defined by Eq. (10) is a p.g.fl. of a point process if and only if the $P_k(\mathbf{z})$ as given in Eq. (4) are probability generating functions (p.g.f.'s). This leads to restrictions on the two-point correlation function $\xi_2$ and the number density $\varrho$ as discussed by |^ and Q . A $P_k(\mathbf{z})$ given by Eq. (4) always has to be positive and monotonically increasing in each component $z_i$ of $\mathbf{z}$, and hence $\log P_k(\mathbf{z})$ is non-decreasing in each component of $\mathbf{z}$. With Eqs. (4), (5) and (10) one gets

$$\frac{\partial \log P_k(\mathbf{z})}{\partial z_i} = \varrho\,|A_i| + \varrho^2 \sum_{j=1}^{k} \int_{A_i} d\mathbf{x} \int_{A_j} d\mathbf{y}\; \xi_2(|\mathbf{x}-\mathbf{y}|)\,(z_j - 1) \ge 0 \tag{11}$$

for any $z_j \ge 0$, where $|A_i|$ is the volume of the set $A_i$. The rather obvious constraint $\varrho \ge 0$ can be derived by setting $z_j = 1$ for all $j$. Setting all components to one except a single one, and taking that component either to zero or to very large values, the following two non-trivial constraints emerge:

$$\frac{\varrho}{|A_i|} \int_{A_i} d\mathbf{x} \int_{A_i} d\mathbf{y}\; \xi_2(|\mathbf{x}-\mathbf{y}|) \le 1, \tag{12}$$

$$\int_{A_i} d\mathbf{x} \int_{A_j} d\mathbf{y}\; \xi_2(|\mathbf{x}-\mathbf{y}|) \ge 0. \tag{13}$$

One can show that these two conditions provide a necessary and sufficient characterization of $\xi_2(r)$ and $\varrho$ to assure that $G[h]$, as given in Eq. (10), is a p.g.fl. pO| ].



Eq. (12) constrains the shape and normalization of the two-point correlation functions admissible in a Gauss-Poisson process. For $A_i = A$

$$\frac{\varrho}{|A|} \int_{A} d\mathbf{x} \int_{A} d\mathbf{y}\; \xi_2(|\mathbf{x}-\mathbf{y}|) = \overline{N}\,\sigma^2(A) \le 1, \tag{14}$$

where $\sigma^2(A)$ are the fluctuations of counts-in-cells in excess of a Poisson process, and $\overline{N} = \varrho\,|A|$ is the mean number of points inside the cell $A$. Hence, the total fluctuations of the number of points $N$ inside $A$ for a Gauss-Poisson process are

$$E\bigl[(N - \overline{N})^2\bigr] = \overline{N} + \overline{N}^2 \sigma^2(A) \le 2\,\overline{N}, \tag{15}$$

and must not exceed twice the value of the fluctuations in a Poisson process ($\sigma^2 = 0$) for any domain $A$ considered. Another way of looking at constraint (12) is by taking $A_i$ as an infinitesimal volume element centered on the origin and $A_j$ equal to some volume $A$:

$$\varrho \int_{A} d\mathbf{y}\; \xi_2(|\mathbf{y}|) \le 1. \tag{16}$$

Consistent with Sect. III B this tells us that, sitting on a point of the process, on average at most one other point in excess of Poisson distributed points can be present. Now consider two volume elements $A_i = dV_i$ and $A_j = dV_j$ separated by a distance $r$; then Eq. (13) implies

$$\xi_2(r) \ge 0. \tag{17}$$

Hence, only clustered point distributions can be modeled by a Gauss-Poisson process. Any zero crossing in $\xi_2(r)$ already indicates the presence of higher-order correlations.
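The two constraints can be checked numerically for a candidate correlation function: the integral of $\xi_2$ over a ball, multiplied by the number density, must not exceed one, and $\xi_2$ must not become negative. The following Python sketch is an illustration only. The function names, the midpoint-rule integration, and the example parameters (correlation length, cut-off, density) are assumptions made here and are not taken from the text.

```python
import math


def excess_neighbours(xi2, rho, r_max, dim=3, n_steps=20000):
    """Midpoint-rule estimate of rho * integral of xi2(|y|) over a ball of radius r_max.

    This is the left-hand side of the integral constraint for a spherical volume A.
    """
    if dim not in (2, 3):
        raise ValueError("only dim = 2 or dim = 3 is handled in this sketch")
    dr = r_max / n_steps
    total = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * dr
        shell = 2.0 * math.pi * r if dim == 2 else 4.0 * math.pi * r * r
        total += xi2(r) * shell * dr
    return rho * total


def is_admissible(xi2, rho, r_max, dim=3, n_steps=20000):
    """Check non-negativity of xi2 on the grid and the integral constraint.

    Non-negativity is only tested at the grid points, so this is a numerical
    plausibility check, not a proof.
    """
    dr = r_max / n_steps
    non_negative = all(xi2((k + 0.5) * dr) >= 0.0 for k in range(n_steps))
    return non_negative and excess_neighbours(xi2, rho, r_max, dim, n_steps) <= 1.0


def power_law_with_cutoff(r, r0=5.0, gamma=1.8, r_cut=20.0):
    """Hypothetical power-law correlation function with a sharp large-scale cut-off."""
    return (r / r0) ** (-gamma) if r < r_cut else 0.0


# Illustrative (not measured) parameter values.
print(is_admissible(power_law_with_cutoff, rho=0.01, r_max=20.0))
```

For a non-negative $\xi_2$ the integral grows with the radius of the ball, so it is sufficient to integrate out to the largest scale on which $\xi_2$ is non-zero.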



B. A Gauss-Poisson process as a Poisson cluster 
process 

A Gauss-Poisson process can be interpreted as a simple Poisson cluster process. This is important for simulations (see Appendix B 1).

A Poisson cluster process is a two-stage point process. First one chooses Poisson distributed cluster centers, the "parents", with number density $\varrho_p$ and then attaches a second point process, the cluster, to each cluster center (the cluster center is not necessarily part of the point process). The p.g.fl. of a Poisson cluster process is then given by [pi

$$\log G[h] = \int_{\mathbb{R}^d} d\mathbf{x}\; \varrho_p\,\bigl(G_c[h|\mathbf{x}] - 1\bigr), \tag{18}$$

with $G_c[h|\mathbf{x}]$ being the p.g.fl. of the point process forming the cluster at center $\mathbf{x}$. Now consider the p.g.fl. of a cluster with at most two points (compare with Eq. (8)),

$$G_c[h|\mathbf{x}] = q_1(\mathbf{x})\,h(\mathbf{x}) + q_2(\mathbf{x})\,h(\mathbf{x}) \int_{\mathbb{R}^d} d\mathbf{y}\; f(|\mathbf{x}-\mathbf{y}|)\,h(\mathbf{y}), \tag{19}$$

where $q_1(\mathbf{x})$ is the probability that only one point, the cluster center at $\mathbf{x}$, is entering the cluster, whereas $q_2(\mathbf{x})$ is the probability that a second point is added. Clearly, $q_1(\mathbf{x}) + q_2(\mathbf{x}) = 1$. The probability density $f(|\mathbf{x}-\mathbf{y}|)$ determines the distribution of the distance $|\mathbf{x}-\mathbf{y}|$ of the second point $\mathbf{y}$ to the cluster center, and is normalized according to $\int d\mathbf{z}\, f(|\mathbf{z}|) = 1$. By writing $f(|\mathbf{x}-\mathbf{y}|)$ one assumes that the probability density $f$ is symmetric in $\mathbf{x}$ and $\mathbf{y}$. Indeed, the p.g.fl. Eq. (19) is invariant under interchanging $\mathbf{x}$ and $\mathbf{y}$, and this assumption does not impose any restrictions. Using this expression and Eq. (18) one obtains

$$\log G[h+1] = \int_{\mathbb{R}^d} d\mathbf{x}\; \varrho_p\,\bigl(1 + q_2(\mathbf{x})\bigr)\,h(\mathbf{x}) + \int_{\mathbb{R}^d} d\mathbf{x} \int_{\mathbb{R}^d} d\mathbf{y}\; \varrho_p\,q_2(\mathbf{x})\,f(|\mathbf{x}-\mathbf{y}|)\,h(\mathbf{x})\,h(\mathbf{y}), \tag{20}$$

which equals the p.g.fl. for the Gauss-Poisson process (10) for $\varrho = \varrho_p (1 + q_2)$ and $\xi_2(r) = 2\,\varrho_p\,q_2\,f(r)/\varrho^2$. Hence every Gauss-Poisson process is a Poisson cluster process of the above type, and vice versa.
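The two-stage construction translates directly into a simulation. The following Python sketch is not the recipe of Appendix B 1 but a minimal illustration under stated assumptions: a two-dimensional periodic unit box, a constant $q_2$, and a hypothetical offset density $f$ that is uniform on a disc. All function names are introduced here for illustration.

```python
import math
import random


def poisson_variate(mean, rng):
    """Draw a Poisson number by counting unit-rate exponential arrivals before `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def uniform_disc_offset(radius):
    """Hypothetical choice of f: second point uniform in a disc around the parent."""
    def sample(rng):
        r = radius * math.sqrt(rng.random())
        phi = 2.0 * math.pi * rng.random()
        return r * math.cos(phi), r * math.sin(phi)
    return sample


def simulate_gauss_poisson(rho, q2, sample_offset, rng, box=1.0):
    """Poisson parents with density rho / (1 + q2); each parent is a point of the
    process and receives a second point with probability q2 (periodic wrapping)."""
    rho_p = rho / (1.0 + q2)
    points = []
    for _ in range(poisson_variate(rho_p * box * box, rng)):
        x, y = rng.uniform(0.0, box), rng.uniform(0.0, box)
        points.append((x, y))
        if rng.random() < q2:
            dx, dy = sample_offset(rng)
            points.append(((x + dx) % box, (y + dy) % box))
    return points


rng = random.Random(1)
points = simulate_gauss_poisson(200.0, 0.5, uniform_disc_offset(0.05), rng)
print(len(points))
```

With this parametrization the mean number density of the realization is $\varrho_p (1 + q_2)$, and the two-point correlation function is proportional to the chosen offset density $f$.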



C. Physical implications

From the preceding section one concludes that at most two points form a cluster in a Gauss-Poisson process. Therefore, no point distribution with large-scale structures can be modeled reliably with this kind of process. This has physical implications both for the galaxy distribution and percolating/critical systems.

More specifically, from the observed galaxy distribution a scale-invariant two-point correlation function $\xi_2(r) = A\,r^{-\gamma}$ with $\gamma \approx 1.8$ is deduced. Clearly such a correlation function does not satisfy the constraint (12). For $0 < \gamma < 3$ a cut-off at large scales has to be introduced. For the galaxy distribution a cut-off at approximately $20\,h^{-1}$Mpc is the lowest value which is roughly compatible with the observed two-point correlation function. Taking into account the observed number density of the galaxies, a cut-off even on this small scale does not help. Still the constraint (12) is strongly violated, indicating non-negligible higher-order correlation functions (see also Sect. IV C). Similarly, a zero crossing or a negative $\xi_2(r)$ violates the constraint (13) and also indicates that higher-order correlations are present. There are indications that the distribution of galaxies shows a negative $\xi_2(r)$ on some scale larger than $20\,h^{-1}$Mpc, followed by a positive peak at approximately $120\,h^{-1}$Mpc l2|-p3 . A Gauss-Poisson process is not able to describe these features in the distribution of galaxies and galaxy clusters.

Also a percolating cluster shows scale-invariant correlations. The correlation length, specifying e.g. the exponential cut-off of the two-point correlation function, goes to infinity near the percolation threshold. Therefore, the geometry of the largest cluster cannot be modeled with a Gauss-Poisson process. Higher-order correlations are essential to describe the morphology of such a system. This again illustrates that the tails of the distributions, in this case the asymptotic behavior of the two-point correlation function, are essential.

To summarize these results: already by looking at the two-point correlation function and the density one is able to exclude a Gauss-Poisson process as a model. However, one cannot turn the argument around and show that a given point distribution is compatible with a Gauss-Poisson process using the two-point correlation function alone. There are point processes with higher-order correlations satisfying the constraints (12, 13), as discussed in Sect. IV A.



IV. DETECTING DEVIATIONS FROM A GAUSS-POISSON PROCESS

After having outlined the basic theory of a Gauss-Poisson process, we discuss in this section how one can detect non-Gaussian features in a given point set.



A. The line-segment process




First a two-dimensional analytic example is studied. In the line-segment process points are randomly distributed on line segments which are themselves uniformly distributed in space and direction. The number of points per line segment is a Poisson random variable. According to H, p. 286

$$\xi_2(r) = \begin{cases} \dfrac{1}{\pi \varrho_s} \left( \dfrac{1}{r\,l} - \dfrac{1}{l^2} \right) & \text{for } r < l, \\[2ex] 0 & \text{for } r \ge l. \end{cases} \tag{21}$$

$l$ is the length of the line segments and $\varrho_s$ is the mean number density of line segments; $l \varrho_s$, $\varrho/\varrho_s$, $\varrho$ denote the mean length density, the mean number of points per line segment (which can be smaller than one), and the mean number density in space, respectively. A similar model for the distribution of galaxies was discussed by ||l^ . On small scales $r \ll l$, $\xi_2(r) \propto r^{-1}$, qualitatively similar to the observed two-point correlation function in the galaxy distribution.

This structured point process incorporates higher-order correlations. In Fig. 1 the line-segment process is shown in comparison to a Gauss-Poisson process with the same two-point correlation function for the parameters $l = 0.1$, $\varrho_s = 201$, $\varrho = 200$, and $\varrho = 500$. A number density $\varrho > \varrho_s$ violates the constraint Eq. (12) and no Gauss-Poisson process equivalent on the two-point level to such a line-segment process exists.
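The verbal definition of the line-segment process can be turned into a short simulation. The following Python sketch makes several assumptions not stated in the text: segment midpoints are placed as a Poisson process in a periodic unit square, orientations are uniform in $[0, \pi)$, and points falling outside the box are wrapped periodically. The function names are hypothetical.

```python
import math
import random


def poisson_variate(mean, rng):
    """Draw a Poisson number by counting unit-rate exponential arrivals before `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_line_segments(rho, rho_s, length, rng, box=1.0):
    """Poisson number of segments; on each segment a Poisson number of points
    with mean rho / rho_s, placed uniformly along the segment."""
    mean_per_segment = rho / rho_s
    points = []
    for _segment in range(poisson_variate(rho_s * box * box, rng)):
        cx, cy = rng.random() * box, rng.random() * box   # segment midpoint
        phi = math.pi * rng.random()                      # segment orientation
        for _point in range(poisson_variate(mean_per_segment, rng)):
            t = (rng.random() - 0.5) * length             # position along the segment
            points.append(((cx + t * math.cos(phi)) % box,
                           (cy + t * math.sin(phi)) % box))
    return points


rng = random.Random(2)
points = simulate_line_segments(rho=200.0, rho_s=201.0, length=0.1, rng=rng)
print(len(points))
```

The parameter values in the usage line are those quoted in the text for Fig. 1; the random seed is arbitrary.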



FIG. 1. The plot (a) shows a realization of the line-segment process inside the unit square with number density $\varrho = 200$ (for the other parameters see the text) and the corresponding two-point correlation function $\xi_2(r)$ ($r$ is in units of the box length): the dashed one-$\sigma$ area was determined from 1000 realizations, the theoretical value is given by the dashed line, nearly on top of the sample mean (solid line). The plot (b) shows a realization of the Gauss-Poisson process with $\varrho = 200$ and the plot (c) a realization of the high density line-segment process with $\varrho = 500$, both with the corresponding two-point correlation function.



B. Detecting higher-order correlations

As can be seen from Fig. 1 the point processes are indistinguishable on the two-point level. For another example see . The differences between these point distributions can be investigated with statistical methods sensitive to higher-order correlations. One may use Minkowski functionals (for reviews see p8| , p9[ ), percolation techniques, the minimum spanning tree pl , a method sensitive to three-point correlations, or directly calculate the higher moments |^3|-^,p|. In the following the J-function is used to quantify the higher-order clustering |4^-|4^].

To define the J-function the spherical contact distribution $F(r)$ is needed, i.e. the distribution function of the distances $r$ between an arbitrary point and the nearest object in the point set. $F(r)$ is equal to one minus the void-probability function: $F(r) = 1 - P_0(r)$. Another ingredient is the nearest neighbor distance distribution $G(r)$, defined as the distribution function of distances $r$ of an object in the point set to the nearest other point. For a Poisson process the probability to find a point only depends on the mean number density, leading to the well-known result

$$G_P(r) = 1 - \exp\bigl(-\varrho\,|B_r|\bigr) = F_P(r), \tag{22}$$

where $|B_r|$ is the volume of a d-dimensional sphere with radius $r$. The ratio

$$J(r) = \frac{1 - G(r)}{1 - F(r)} \tag{23}$$

was suggested as a probe for clustering of a point distribution. For a Poisson distribution $J(r) = 1$ follows directly from Eq. (22). A clustered point distribution implies $J(r) \le 1$, whereas regular structures are indicated by $J(r) \ge 1$. As discussed in [|7| one can express the $J(r)$ function in terms of the n-point correlation functions $\xi_n$:

$$J(r) = 1 + \sum_{n=1}^{\infty} \frac{(-\varrho)^n}{n!} \int_{B_r} d\mathbf{x}_1 \cdots \int_{B_r} d\mathbf{x}_n\; \xi_{n+1}(0, \mathbf{x}_1, \ldots, \mathbf{x}_n). \tag{24}$$

$B_r$ is a d-dimensional sphere with radius $r$ centered on the origin. For a Gauss-Poisson process in two dimensions, i.e. $\xi_n = 0$ for $n > 2$, the above expression simplifies$^2$:

$$J(r) = 1 - \varrho\, 2\pi \int_0^r ds\; s\, \xi_2(s). \tag{25}$$

$^2$ Unfortunately, this Gaussian approximation has been discussed with examples of two-point correlation functions which are not admissible in a Gauss-Poisson process.
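The Gaussian expression for $J(r)$ in two dimensions is a one-dimensional integral over $\xi_2$ and is simple to evaluate. The following Python sketch illustrates this for a two-point correlation function of the line-segment type, $\xi_2(s) = (1/(s\,l) - 1/l^2)/(\pi \varrho_s)$ for $s < l$ and zero otherwise. This functional form, the function names, and the midpoint rule are assumptions of the sketch. For this form the integral can also be done by hand and gives $J(l) = 1 - \varrho/\varrho_s$.

```python
import math


def j_gauss_poisson_2d(xi2, rho, r, n_steps=2000):
    """Evaluate J(r) = 1 - rho * 2 * pi * integral_0^r s * xi2(s) ds (midpoint rule)."""
    if r <= 0.0:
        return 1.0
    ds = r / n_steps
    integral = sum((k + 0.5) * ds * xi2((k + 0.5) * ds) for k in range(n_steps)) * ds
    return 1.0 - rho * 2.0 * math.pi * integral


def xi2_line_segment(s, length=0.1, rho_s=201.0):
    """Assumed two-point correlation function of the line-segment process."""
    if s >= length:
        return 0.0
    return (1.0 / (s * length) - 1.0 / length ** 2) / (math.pi * rho_s)


for r in (0.02, 0.05, 0.1):
    print(r, j_gauss_poisson_2d(xi2_line_segment, rho=200.0, r=r))
```

Since $J(r)$ cannot be negative, the value at $r = l$ shows again that a Gauss-Poisson process with this $\xi_2$ requires $\varrho \le \varrho_s$.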
FIG. 2. The spherical contact distribution $F(r)$ (plot (a)), the nearest neighbor distribution $G(r)$ (plot (b)) and the $J(r)$ function (plot (c)) are shown for the Gauss-Poisson process ($\varrho = 200$, mean value (dashed line) and variance (shaded area) estimated from 1000 realizations), and the line-segment process ($\varrho = 200$, solid line); $r$ is in units of the box length. The $J(r)$ function according to Eq. (25) (dashed dotted line) lies on top of the estimated mean. The dotted line marks the results for a Poisson process. The sequence of solid lines in plot (d) shows the $J(r)$ function for line-segment processes with $\varrho = 100, 150, 200, 300, 400, 500, 600$, bending down successively.

In Fig. 2 the results for $F(r)$, $G(r)$, and $J(r)$, estimated from several line-segment processes and the Gauss-Poisson process, are shown; all the processes investigated had the same two-point correlation function $\xi_2(r)$ given in Eq. (21). The line-segment process allows for larger voids than the Gauss-Poisson process, as seen from the comparison of $F_{\rm line}$ and $F_{\rm GP}$. On small scales the $J(r)$ of the line-segment process is well approximated by the $J(r)$ for the Gauss-Poisson process. However, on large scales the Gauss-Poisson process shows a significantly smaller $J(r)$ function than the line-segment process. The $J(r)$ function is known analytically for several point process models 1 46 4^. In any of these cases a smaller $J(r)$ is an



indication for stronger (positive) interaction between the 
points (see also [|oj51 ). Specifically for Gibbs-processes 
(see e.g. [|5)) an attractive interaction leads to a mono- 
tonically decreasing J(r) and a stronger interaction leads 
to smaller values of J{r). Hence, the presence of higher- 
order correlation functions in the line-segment process 
gives rise to a reduced clustering strength, in the sense 
discussed above. Clearly, the signal of J(r) also depends 
on the number density. 



C. The non-Gaussian galaxy distribution

As already mentioned, the three-dimensional distribution of galaxies cannot be modeled in terms of a Gauss-Poisson process: the constraints on the density and two-point correlation function are violated. In the following this is illustrated with a volume-limited sample of $100\,h^{-1}$Mpc depth, extracted from the PSCz galaxy catalogue 1^ . The volume-limited sample incorporates 2232 galaxies with galactic latitude $|b| > 5°$. A detailed description of the sample considered here may be found in psf . Estimators for the two-point correlation function are quite abundant (see |Q and references therein). The results presented here depend neither on the estimator nor on the exact sample geometry, which is indeed more complicated (see fS^). For the $J(r)$-function the minus estimator is used |15| , p5t .

In Fig. 3 the estimated two-point correlation function is shown. The integral

$$\varrho \int d\mathbf{y}\; \xi_2(|\mathbf{y}|) \approx 4.4 \tag{26}$$

violates the constraint (16), and the corresponding Gauss-Poisson process does not exist. Indeed, higher-order correlation functions have been detected by ]5^ ] using factorial moments. By thinning the galaxy distribution (i.e. randomly sub-sampling), one generates a point set with the same correlation functions $\xi_n$ as the observed galaxy distribution, however with a reduced number of points. Since the number density enters linearly in the constraint (16), a comparison of the thinned galaxy distribution with a Gauss-Poisson process becomes feasible. The strongly interacting galaxy distribution, as indicated by the small values of $J(r)$, shows increasingly weaker interaction (higher values of $J(r)$) for the diluted subsamples (Fig. 3).
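Because random thinning leaves the $\xi_n$ unchanged and rescales the density by the retained fraction, the largest admissible fraction follows from simple arithmetic: the value 4.4 has to be multiplied by a fraction of at most $1/4.4 \approx 0.23$. The following Python sketch illustrates this together with the thinning step itself; the function names are hypothetical and the point list is a placeholder, not the PSCz sample.

```python
import random


def max_retention_fraction(excess):
    """Largest fraction p with p * excess <= 1, where excess is the value of
    rho * integral of xi2 measured for the full sample."""
    if excess <= 0.0:
        raise ValueError("excess must be positive")
    return min(1.0, 1.0 / excess)


def thin(points, keep_fraction, rng):
    """Independent random thinning: keep each point with probability keep_fraction."""
    return [p for p in points if rng.random() < keep_fraction]


p_max = max_retention_fraction(4.4)
print(p_max)

# Placeholder point set; each element stands for one galaxy position.
sample = list(range(2232))
diluted = thin(sample, 0.2, random.Random(0))
print(len(diluted))
```

A retained fraction of 20% lies below this bound, since $0.2 \times 4.4 = 0.88 \le 1$.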

Now consider a sample with only 20% of the actual observed galaxies, where the constraint (16) is satisfied (compare with (26)). This diluted sample is compared to a Gauss-Poisson process with the same two-point correlation function. The $\xi_2(r)$ determined from the simulated Gauss-Poisson process matches perfectly with the observed correlation function (Fig. 3). On small scales the $J(r)$-function of the thinned PSCz is reasonably modeled by the Gauss-Poisson process. However, on large scales the Gauss-Poisson process shows stronger interactions, whereas the thinned galaxy sample, with its higher-order correlation functions, shows weaker interactions in the sense discussed in Sect. IV B.
FIG. 3. In plot (a) the observed two-point correlation function $\xi_2(r)$ of the volume-limited subsample with $100\,h^{-1}$Mpc depth from the PSCz galaxy catalogue is shown (solid line). The dotted line and the one-$\sigma$ area are estimated from 200 realizations of a Gauss-Poisson process using the estimated two-point correlation function as an input but with only a fifth of the number of points. In plot (b) the $J(r)$ function of the same sample is shown with 100% (solid line), 50% (short dashed line), and 20% of the galaxies (long dashed line). The shaded area is the one-$\sigma$ region obtained from the Gauss-Poisson process corresponding to the galaxy sample with only 20% of the points.



V. POINT PROCESSES WITH HIGHER-ORDER 
CLUSTERING 

As already mentioned, the measured two-point correlation function of the galaxy distribution together with the observed density of galaxies violates the constraints Eqs. (12) and (13). Consequently the distribution of galaxies cannot be modeled with a Gauss-Poisson process. Even more compelling, there is a clear detection of higher-order correlations in the galaxy distribution (e.g. p^ , p7| , p8[ ). Hence, one is interested in analytically tractable approximations of the cumulant expansion (7). Hierarchical closure relations have been extensively studied (see Sect. VII A). In the following a truncation of the expansion (7) beyond the Gaussian term and the n-point Poisson cluster processes will be used.

Such a truncation may serve as a model for the galaxy distribution in the weakly nonlinear regime. Using perturbation theory one may show that $\xi_3 \propto (\xi_2)^2$ (see [p9|). For large separations $r$ the correlation function $\xi_2(r)$ is smaller than unity, and consequently a truncation of the expansion (7) at $n > 2$ provides a viable model for the large-scale distribution of galaxies.

The general Poisson cluster process is the starting point: consider the expansion of the cluster p.g.fl. $G_c[h|\mathbf{x}]$ in terms of Janossy densities conditional on the cluster center $\mathbf{x}$ (see Eq. (8)):

$$G_c[h|\mathbf{x}] = J_0(\mathbf{x}) + \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n|\mathbf{x})\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n). \tag{27}$$

Explicit expressions for the Janossy densities are given below. The p.g.fl. of a Poisson cluster process is then given by

$$G[h] = \exp\left( \int_{\mathbb{R}^d} d\mathbf{x}\; \varrho_p\,\bigl(G_c[h|\mathbf{x}] - 1\bigr) \right) = \exp\left( \int_{\mathbb{R}^d} d\mathbf{x}\; \varrho_p \left( \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n|\mathbf{x})\, h(\mathbf{x}_1)\cdots h(\mathbf{x}_n) - 1 \right) \right). \tag{28}$$

Here the probability $q_0$ of having no point in the cluster at $\mathbf{x}$ is assumed to be zero, i.e. $J_0 = 0$. This does not impose any additional constraints; it only leads to a redefinition of the number density of cluster centers, $\varrho_p' = \varrho_p (1 - q_0)$. Using this more formal approach the p.g.fl. of the Gauss-Poisson process can be written in terms of the Janossy densities with $j_n = 0$ for $n > 2$:

$$j_1(\mathbf{x}_1|\mathbf{x}) = q_1(\mathbf{x})\,\delta^d(\mathbf{x}-\mathbf{x}_1), \qquad j_2(\mathbf{x}_1,\mathbf{x}_2|\mathbf{x}) = q_2(\mathbf{x})\,\delta^d(\mathbf{x}-\mathbf{x}_1)\,f_2(\mathbf{x}_2|\mathbf{x}_1)\,2!. \tag{29}$$

Here $j_1$ ($j_2$) are the probability densities for the spatial distribution of one (two) points in the cluster, multiplied by the probability $q_1$ ($q_2$) that there are exactly one (two) points in the cluster at $\mathbf{x}$. $\delta^d$ is the d-dimensional Dirac distribution. $f_2(\mathbf{x}_2|\mathbf{x}_1)$ is the probability density of the second point $\mathbf{x}_2$ under the condition that there is a point at $\mathbf{x}_1$, normalized by $\int d\mathbf{x}_2\, f_2(\mathbf{x}_2|\mathbf{x}_1) = 1$.

The p.g.fl. (28) is invariant under changes of the order of integration, implying that one can use the $j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n|\mathbf{x})$ symmetrically defined in all coordinates (including $\mathbf{x}$). With the additional assumption of homogeneity and isotropy one gets $f_2(\mathbf{x}_2|\mathbf{x}_1) = f(|\mathbf{x}_1 - \mathbf{x}_2|)$, as already used in Sect. III B for the construction of the Gauss-Poisson process.



A. The three-point Poisson cluster process

In a three-point Poisson cluster process Eq. (27) is truncated at the third order and at most three points per cluster are allowed. In addition to Eq. (29)

$$j_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}) = q_3(\mathbf{x})\,\delta^d(\mathbf{x}-\mathbf{x}_1)\,f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1)\,3! \tag{30}$$

appears, with the probability $q_3(\mathbf{x})$ that the cluster consists of three points, and $q_1 + q_2 + q_3 = 1$ with $q_i \ge 0$. $f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1)$ is the probability density that there are two points at $\mathbf{x}_2$ and $\mathbf{x}_3$, under the condition that one point is at $\mathbf{x}_1$, with the normalization $\int d\mathbf{x}_2 \int d\mathbf{x}_3\, f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1) = 1$. Inserting these definitions one obtains



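$$
\begin{aligned}
\log G[h] ={}& \int_{\mathbb{R}^d} d\mathbf{x}_1\, \varrho_p\, q_1(\mathbf{x}_1)\, \bigl(h(\mathbf{x}_1) - 1\bigr) \\
&+ \int_{\mathbb{R}^d} d\mathbf{x}_1\, \varrho_p \int_{\mathbb{R}^d} d\mathbf{x}_2\; q_2(\mathbf{x}_1) f_2(\mathbf{x}_2|\mathbf{x}_1)\, \bigl(h(\mathbf{x}_1)h(\mathbf{x}_2) - 1\bigr) \\
&+ \int_{\mathbb{R}^d} d\mathbf{x}_1\, \varrho_p \int_{\mathbb{R}^d} d\mathbf{x}_2 \int_{\mathbb{R}^d} d\mathbf{x}_3\; q_3(\mathbf{x}_1) f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1)\, \bigl(h(\mathbf{x}_1)h(\mathbf{x}_2)h(\mathbf{x}_3) - 1\bigr).
\end{aligned}
\tag{31}
$$

As already mentioned, $f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1)$ can be assumed to be symmetric in its three arguments. Slightly abusing notation, let $f_2(\mathbf{x}_1,\mathbf{x}_2)$ and $f_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3)$ be the symmetrically defined densities corresponding to $f_2(\mathbf{x}_2|\mathbf{x}_1)$ and $f_3(\mathbf{x}_2,\mathbf{x}_3|\mathbf{x}_1)$, and define

$$
f_3^{(2)}(\mathbf{x}_1,\mathbf{x}_2) = \int_{\mathbb{R}^d} d\mathbf{x}_3\; f_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3). \tag{32}
$$

Replacing $h$ by $h + 1$ and rearranging the terms, the factorial cumulant expansion of the three-point cluster process reads

$$
\begin{aligned}
\log G[h+1] ={}& \int_{\mathbb{R}^d} d\mathbf{x}_1\, h(\mathbf{x}_1)\, \varrho_p \bigl(1 + q_2(\mathbf{x}_1) + 2 q_3(\mathbf{x}_1)\bigr) \\
&+ \int_{\mathbb{R}^d} d\mathbf{x}_1 \int_{\mathbb{R}^d} d\mathbf{x}_2\; h(\mathbf{x}_1)h(\mathbf{x}_2)\, \varrho_p \bigl(q_2(\mathbf{x}_1) f_2(\mathbf{x}_1,\mathbf{x}_2) + 3 q_3(\mathbf{x}_1) f_3^{(2)}(\mathbf{x}_1,\mathbf{x}_2)\bigr) \\
&+ \int_{\mathbb{R}^d} d\mathbf{x}_1 \int_{\mathbb{R}^d} d\mathbf{x}_2 \int_{\mathbb{R}^d} d\mathbf{x}_3\; h(\mathbf{x}_1)h(\mathbf{x}_2)h(\mathbf{x}_3)\, \varrho_p\, q_3(\mathbf{x}_1) f_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3).
\end{aligned}
\tag{33}
$$

Comparing Eq. (33) with the factorial cumulant expansion of the p.g.fl. one arrives at

$$
\begin{aligned}
\varrho &= (q_1 + 2q_2 + 3q_3)\, \varrho_p = (1 + q_2 + 2q_3)\, \varrho_p, \\
\xi_2(\mathbf{x}_1,\mathbf{x}_2) &= \frac{2!}{(1 + q_2 + 2q_3)^2\, \varrho_p}\, \bigl(q_2 f_2(\mathbf{x}_1,\mathbf{x}_2) + 3 q_3 f_3^{(2)}(\mathbf{x}_1,\mathbf{x}_2)\bigr), \\
\xi_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3) &= \frac{3!}{(1 + q_2 + 2q_3)^3\, \varrho_p^2}\; q_3 f_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3),
\end{aligned}
\tag{34}
$$

and the correlation functions $\xi_n$ equal zero for $n \ge 4$. The simulation procedure for the three-point Poisson cluster process is described in Appendix B 2.

B. Constraints on $\varrho$, $\xi_2$ and $\xi_3$

By the definition of the three-point Poisson cluster process, the probability densities $f_2 \ge 0$, $f_3 \ge 0$, and $f_3^{(2)} \ge 0$, and consequently $\xi_2(r) \ge 0$ for all $r$, as well as $\xi_3 \ge 0$. This is a generic feature of Poisson cluster processes.

The Gauss-Poisson process, defined through the truncation of the cumulant expansion after the second term, is equivalent to the two-point Poisson cluster process (see Sect. III B). Unfortunately, this equivalence does not hold for the higher n-point processes anymore. The general three-point process is defined as a point process with a factorial cumulant expansion truncated after the third term. Proceeding similarly to Sect. III A, necessary conditions for the existence of such a point process can be derived (compare with the corresponding condition for the Gauss-Poisson process in Sect. III A):

$$
\begin{aligned}
0 \le{}& \varrho |A_1| + \sum_{i} \varrho^2 |A_1||A_i|\, \bar{\xi}_2(A_1,A_i)\,(z_i - 1) \\
&+ \sum_{i} \sum_{j} \frac{\varrho^3}{2}\, |A_1||A_i||A_j|\, \bar{\xi}_3(A_1,A_i,A_j)\,(z_i - 1)(z_j - 1),
\end{aligned}
\tag{35}
$$

with the volume-averaged correlation functions

$$
\bar{\xi}_n(A_1,\ldots,A_n) = \frac{1}{|A_1| \cdots |A_n|} \int_{A_1} d\mathbf{x}_1 \cdots \int_{A_n} d\mathbf{x}_n\; \xi_n(\mathbf{x}_1,\ldots,\mathbf{x}_n), \tag{36}
$$

and for consistency $\bar{\xi}_1(A) = 1$. Again, for $z_i = 1$ one obtains $\varrho \ge 0$. The non-trivial constraints read:

$$
1 \ge \varrho |A_s|\, \bar{\xi}_2(A_1,A_s) - \frac{\varrho^2}{2} |A_s|^2\, \bar{\xi}_3(A_1,A_s,A_s), \tag{37}
$$

$$
0 \le \bar{\xi}_3(A_1,A_s,A_s), \tag{38}
$$

$$
\begin{aligned}
1 \ge{}& \varrho |A_s|\, \bar{\xi}_2(A_1,A_s) + \varrho |A_r|\, \bar{\xi}_2(A_1,A_r) \\
&- \frac{\varrho^2}{2} |A_s|^2\, \bar{\xi}_3(A_1,A_s,A_s) - \frac{\varrho^2}{2} |A_r|^2\, \bar{\xi}_3(A_1,A_r,A_r) \\
&- \varrho^2 |A_s||A_r|\, \bar{\xi}_3(A_1,A_s,A_r).
\end{aligned}
\tag{39}
$$

Eq. (37) can be derived by setting $z_s = 0$ and $z_i = 1$ for all $i \ne s$; Eq. (38) follows from $z_s \to \infty$ and $z_i = 1$ for all $i \ne s$. Using $z_r, z_s \to \infty$ does not lead to new constraints. With $z_r = 0 = z_s$, $r \ne s$, and $z_i = 1$ for all $i \ne r,s$ one obtains Eq. (39). No additional constraint arises by setting z = 2.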

Eq. (38) implies $\bar{\xi}_3 \ge 0$. Eq. (37) and Eq. (39) are the extension of the constraint (14). The terms proportional to $\bar{\xi}_3$ can balance the terms with $\bar{\xi}_2$, so that a clustering point process with a number density higher than in a Gauss-Poisson process is possible. Moreover, $\bar{\xi}_2$ is not constrained to positive values anymore. Hence, already by including three-point correlations, a point process model with a two-point correlation function $\xi_2$ having a zero crossing becomes admissible. This answers affirmatively the question by [60], whether there exists a general three-point cluster process with a negative second moment. However, in the three-point Poisson cluster process discussed in the preceding section $\xi_2 \ge 0$ is required, illustrating that the three-point Poisson cluster processes form only a subset of all possible three-point processes.
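The constraints (37)-(39) are plain inequalities once the volume-averaged correlation functions of the cells are known. The following Python lines are a minimal sketch of such a check for a reference cell $A_1$ and two cells $A_s$, $A_r$. The function name and argument names are assumptions made for this illustration, and the factors $1/2$ are those that follow from the third-order term of the factorial cumulant expansion.

```python
def three_point_constraints(rho, vol_s, vol_r,
                            xi2_1s, xi2_1r,
                            xi3_1ss, xi3_1rr, xi3_1sr):
    """Evaluate the necessary conditions (37)-(39) for a general three-point process.

    rho            : mean number density
    vol_s, vol_r   : volumes |A_s| and |A_r| of the two cells
    xi2_1s, xi2_1r : volume-averaged two-point functions for (A_1, A_s), (A_1, A_r)
    xi3_1ss, ...   : volume-averaged three-point functions for the cell triples
    Returns a dict mapping the equation number to True (satisfied) or False.
    """
    # Eq. (37): z_s = 0, all other z_i = 1
    c37 = 1.0 >= rho * vol_s * xi2_1s - 0.5 * rho**2 * vol_s**2 * xi3_1ss
    # Eq. (38): z_s -> infinity
    c38 = xi3_1ss >= 0.0
    # Eq. (39): z_r = z_s = 0, all other z_i = 1
    c39 = 1.0 >= (rho * vol_s * xi2_1s + rho * vol_r * xi2_1r
                  - 0.5 * rho**2 * vol_s**2 * xi3_1ss
                  - 0.5 * rho**2 * vol_r**2 * xi3_1rr
                  - rho**2 * vol_s * vol_r * xi3_1sr)
    return {37: c37, 38: c38, 39: c39}
```

With vanishing three-point terms the check reduces to the Gauss-Poisson case, so a combination such as `rho = 2`, `vol_s = 1`, `xi2_1s = 1` fails Eq. (37), while the same values with a positive three-point term can pass. A negative volume-averaged two-point function does not violate any of the three conditions by itself.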



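C. The n-point Poisson cluster process

It is now clear how to construct the n-point Poisson cluster process. Let $q_m$ be the probability of having $m$ points per cluster with $\sum_{m=1}^{n} q_m = 1$. The Janossy density

$$
j_n(\mathbf{x}_1,\ldots,\mathbf{x}_n|\mathbf{x}) = q_n(\mathbf{x})\, \delta^D(\mathbf{x} - \mathbf{x}_1)\, f_n(\mathbf{x}_2,\ldots,\mathbf{x}_n|\mathbf{x}_1)\, n! \tag{40}
$$

determines the distribution of the $n$ points inside the cluster ($f_1 = 1$). As above the $f_n$ are assumed to be symmetric in all their arguments, and for $n > m$

$$
f_n^{(m)}(\mathbf{x}_1,\ldots,\mathbf{x}_m) = \int_{\mathbb{R}^d} d\mathbf{x}_{m+1} \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; f_n(\mathbf{x}_1,\ldots,\mathbf{x}_n), \tag{41}
$$

and $f_m = f_m^{(m)}$. Inserting Eq. (40) into Eq. (28) and after some algebraic manipulations one can compare term by term with the factorial cumulant expansion of the p.g.fl.:

$$
\varrho = \varrho_p \sum_{m=1}^{n} m\, q_m, \qquad
\xi_k(\mathbf{x}_1,\ldots,\mathbf{x}_k) = \frac{k!\, \varrho_p}{\varrho^k} \sum_{m=k}^{n} \binom{m}{k}\, q_m\, f_m^{(k)}(\mathbf{x}_1,\ldots,\mathbf{x}_k), \tag{42}
$$

with $k \le n$, and $\xi_k = 0$ for $k > n$. For $n = 3$ this reduces to Eq. (34). The statistical properties of this n-point Poisson cluster process are now completely specified by the correlation functions $\xi_k$ with $k \le n$ and the mean density $\varrho$. Eqs. (42) and the normalization of the $f_m$ can be used to determine the $f_m$ as well as $\varrho_p$ and the $q_m$ from given correlation functions $\xi_n$ and the number density $\varrho$. A simulation algorithm similar to the one described in Appendix B 2 can be constructed.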
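The construction of the n-point Poisson cluster process translates into a short simulation loop: draw Poisson distributed cluster centers, choose the number of points $m$ of each cluster with probability $q_m$, put the first point on the center and distribute the remaining $m-1$ points around it. The Python sketch below follows these steps in a periodic square. It is not the algorithm of Appendix B 2; the function names, the periodic box and the uniform distribution of the extra points in a disc (one particular choice for the densities $f_m$) are assumptions made only for this illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean)."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_cluster_process(rho_p, q, radius, box=1.0, seed=0):
    """Simulate an n-point Poisson cluster process in a periodic square.

    rho_p  : number density of cluster centers
    q      : q[m-1] is the probability of m points per cluster, sum(q) == 1
    radius : extra cluster members are placed uniformly in a disc of this
             radius around the first point (an illustrative choice of f_m)
    """
    rng = random.Random(seed)
    sizes = list(range(1, len(q) + 1))
    points = []
    for _ in range(sample_poisson(rho_p * box * box, rng)):
        x0, y0 = rng.uniform(0.0, box), rng.uniform(0.0, box)
        m = rng.choices(sizes, weights=q)[0]
        points.append((x0, y0))  # the first point sits on the cluster center
        for _ in range(m - 1):
            r = radius * math.sqrt(rng.random())  # uniform in the disc
            phi = 2.0 * math.pi * rng.random()
            points.append(((x0 + r * math.cos(phi)) % box,
                           (y0 + r * math.sin(phi)) % box))
    return points
```

The mean number of points per unit area of such a realization should scatter around $\varrho_p \sum_m m\, q_m$, which offers a quick sanity check of the implementation.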



D. The general n-point process

The general n-point process is defined as the point process resulting from the factorial cumulant expansion truncated after the nth term. Proceeding similarly to Sect. III A one arrives at the constraint equations

$$
\begin{aligned}
0 \le{}& \varrho |A_1| + \varrho^2 \sum_{i} |A_1||A_i|\, \bar{\xi}_2(A_1,A_i)\,(z_i - 1) + \cdots \\
&+ \frac{\varrho^n}{(n-1)!} \sum_{i_1,\ldots,i_{n-1}} |A_1||A_{i_1}| \cdots |A_{i_{n-1}}| \\
&\qquad \times\, \bar{\xi}_n(A_1,A_{i_1},\ldots,A_{i_{n-1}})\,(z_{i_1} - 1) \cdots (z_{i_{n-1}} - 1).
\end{aligned}
\tag{43}
$$

It is now possible to compute the constraints for the n-point process, in close analogy to the three-point process in Sect. V B. Ref. [60] gave necessary and sufficient conditions for the existence of a generalized Hermite distribution (closely related to this n-point process). They discuss the constraints for a slightly different expansion of the p.g.f. Unfortunately, the transformation of their expansion to the expansion in terms of correlation functions is as tedious as the direct calculation of the constraints.



VI. RANDOM FIELDS VS. POINT PROCESSES 

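A random field $u(\mathbf{x})$ is in the simplest case a real-valued function on $\mathbb{R}^d$. In cosmology the initial mass-density field is often modeled as a Gaussian random field (see e.g. P,|6T|). The nonlinear evolution of the density field unavoidably introduces higher-order correlations. A random field $u(\mathbf{x})$ is stochastically characterized by its characteristic functional (e.g. [62])

$$
\Phi^u[v] = E^u\left[ \exp\left( i \int_{\mathbb{R}^d} d\mathbf{x}\; v(\mathbf{x}) u(\mathbf{x}) \right) \right], \tag{44}
$$

where $E^u$ denotes the expectation value over realizations of the random field $u$. In close analogy to the expansion (A5) of the characteristic function of a random variable in terms of cumulants, one obtains the expansion of the characteristic functional

$$
\ln \Phi^u[v] = \sum_{n=1}^{\infty} \frac{i^n}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; c_n^u(\mathbf{x}_1,\ldots,\mathbf{x}_n)\, v(\mathbf{x}_1) \cdots v(\mathbf{x}_n) \tag{45}
$$

in terms of n-point cumulants $c_n^u(\mathbf{x}_1,\ldots,\mathbf{x}_n)$. Here $c_1^u(\mathbf{x}) = E^u[u(\mathbf{x})] = \bar{u}$ is the mean value. The correlation function of the field is

$$
\xi_2^u(\mathbf{x}_1,\mathbf{x}_2) = \frac{c_2^u(\mathbf{x}_1,\mathbf{x}_2)}{\bar{u}^2} = \frac{E^u[u(\mathbf{x}_1)u(\mathbf{x}_2)]}{\bar{u}^2} - 1, \tag{46}
$$

and similarly for higher-order correlation functions (see e.g. |6|,^). The well known characteristic functional of the Gaussian random field reads

$$
\ln \Phi^u[v] = i \int_{\mathbb{R}^d} d\mathbf{x}_1\; \bar{u}\, v(\mathbf{x}_1) - \frac{1}{2} \int_{\mathbb{R}^d} d\mathbf{x}_1 \int_{\mathbb{R}^d} d\mathbf{x}_2\; c_2^u(\mathbf{x}_1,\mathbf{x}_2)\, v(\mathbf{x}_1) v(\mathbf{x}_2), \tag{47}
$$

with the covariance function $c_2^u(\mathbf{x}_1,\mathbf{x}_2)$.

The characteristic functional of a point process is defined by

$$
\Phi[h] = E\left[ \exp\left( i \int_{\mathbb{R}^d} N(d\mathbf{x})\, h(\mathbf{x}) \right) \right], \tag{48}
$$

and the relation to the p.g.fl. is $\Phi[h] = G[e^{ih}]$. An expansion into cumulants $c_k(\mathbf{x}_1,\ldots,\mathbf{x}_k)$ is also possible:

$$
\log \Phi[h] = \sum_{k=1}^{\infty} \frac{i^k}{k!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_k\; c_k(\mathbf{x}_1,\ldots,\mathbf{x}_k)\, h(\mathbf{x}_1) \cdots h(\mathbf{x}_k). \tag{49}
$$

The cumulants $c_k(\cdot)$ should not be confused with the factorial cumulants $c_{[k]}(\cdot)$.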



A. A theorem of Marcinkiewicz 

A theorem of Marcinkiewicz |6^] states that if the characteristic function $\varphi(s)$ of a random variable (see Appendix A) is the exponential of a polynomial with finite degree larger than two, then the positive definiteness of the probability distribution is violated (see e.g. [65-67]). The generalized Marcinkiewicz theorem for characteristic functionals [69] tells us that this expansion has to be either infinite or a polynomial in $h(\mathbf{x})$ (or $v(\mathbf{x})$) of degree less than or equal to two. This directly applies to the expansion of the characteristic functionals of a random field $\Phi^u[v]$ and a point process $\Phi[h]$ in terms of cumulants.

However, for a point process one can see that the expansion of the p.g.fl. in terms of factorial cumulants $c_{[k]}(\cdot)$ (or correlation functions $\xi_k(\cdot)$) allows a truncation at a finite $k \ge 2$. As long as the constraints on the density and the correlation functions are fulfilled, the point process is well defined. Although the p.g.fl. was used mainly in the context of discrete events, it seems worthwhile to consider the characterization of random fields with factorial cumulants.

Another systematic expansion is provided by the Edge- 
worth series. It was successfully applied in cosmology to 
quantify the one point probability distribution function 
for the smoothed density field on large scales Re- 
cently, Q showed how to use the truncated Edgeworth 
series to generate realizations of non-Gaussian random 
fields with predefined correlation properties. The trun- 
cated Edgeworth series also violates the positive definite- 
ness of the probability distribution, but restore the 
positivity, reintroducing higher correlations, leading to a 
"leaking" into higher correlations. 

The cumulants $c_k$ and the factorial cumulants $c_{[l]}$ of a random variable are related by $c_k = \sum_{l=1}^{k} S_2(k,l)\, c_{[l]}$ (see Eq. (A15)). Looking at the Poisson cluster processes discussed in the preceding sections, one observes that a truncation of the factorial cumulants does not carry over to the cumulants of a point process. As an example consider the three-point Poisson cluster process with $c_{[n]}(\cdot) = 0$ for all $n > 3$. A finite $c_{[3]}(\cdot)$ leads to non-zero $c_n(\cdot)$ for all $n$ (see the Appendix for details).
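For a single random variable, such as the number of points in a fixed cell, the relation $c_k = \sum_{l=1}^{k} S_2(k,l)\, c_{[l]}$ can be evaluated directly. The following Python sketch (function names are assumptions made for this illustration) computes the Stirling numbers of the second kind from their standard recurrence and converts a truncated list of factorial cumulants into ordinary cumulants.

```python
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def cumulants_from_factorial(fact_cum, order):
    """Return [c_1, ..., c_order] from factorial cumulants [c_[1], c_[2], ...].

    Factorial cumulants beyond len(fact_cum) are taken to be zero.
    """
    result = []
    for k in range(1, order + 1):
        c_k = sum(stirling2(k, l) * fact_cum[l - 1]
                  for l in range(1, min(k, len(fact_cum)) + 1))
        result.append(c_k)
    return result
```

Passing three non-zero factorial cumulants yields non-zero cumulants at every requested order, whereas a single factorial cumulant (the Poisson case) gives identical cumulants at all orders.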



B. The Poisson model 

In cosmology the point distribution is often related to the mass density field assuming the "Poisson model". The value of the mass density field is assumed to be proportional to the local number density, and the point distribution is constructed by "Poisson sampling" the correlated mass density field. If the mass density field is itself a realization of a random field, the resulting point process is called a Cox process, or doubly stochastic process. Within this model one may show that the cumulants $c_n^u$ of the density field are proportional to the factorial cumulants $c_{[n]}$ of the point distribution ||72| , p6| : $c_{[n]} = c_n^u / \bar{u}^n$. It is important to notice that this relates the characteristic functional $\Phi^u[v]$ of the random field with the p.g.fl. of the point process $G[h]$. For the correlation functions one obtains

$$
\xi_n^u(\mathbf{x}_1,\ldots,\mathbf{x}_n) = \xi_n(\mathbf{x}_1,\ldots,\mathbf{x}_n). \tag{50}
$$

Hence, this model allows the direct comparison of predictions from analytical calculations with the observed correlation functions in the galaxy distribution.

Clearly the question arises what is wrong with the simple picture that one starts with a Gaussian random field and "Poisson samples" it to obtain the desired point distribution. The answer is that a Gaussian random field is an approximate model for a mass density field only if the fluctuations are significantly smaller than the mean mass density. Otherwise negative mass densities (i.e. negative "probabilities" for the Poisson sampling) would occur. Only in the limit of vanishing fluctuations does a Poisson sampled Gaussian random field become a permissible model. However, in this limit one is left with a pure Poisson process.
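The size of the problem can be made concrete with the one-point distribution: for a Gaussian field with mean $\bar{u}$ and rms fluctuation $\sigma$, the fraction of space with negative density is the Gaussian tail probability $\tfrac{1}{2}\,\mathrm{erfc}\bigl(\bar{u}/(\sigma\sqrt{2})\bigr)$. The short Python sketch below (the function name is an assumption made for this illustration) evaluates it.

```python
import math


def negative_fraction(mean, sigma):
    """Probability that a Gaussian variable with given mean and rms is negative."""
    return 0.5 * math.erfc(mean / (sigma * math.sqrt(2.0)))
```

For fluctuations as large as the mean roughly one sixth of the field values are negative, while for fluctuations of a tenth of the mean the fraction is negligible, in line with the limit discussed above.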



VII. MODELS FOR STRONGLY CORRELATED 
SYSTEMS 



In Sects. III and V several types of point processes were discussed, all featuring a truncated factorial cumulant expansion. As argued at the beginning of Sect. V such a truncation is feasible for the matter distribution in the Universe, as long as $\xi_2(r) < 1$, i.e. for points with large separations. Mainly two approaches have been followed to model the galaxy distribution also on small scales with $\xi_2(r) \gg 1$. The hierarchical models are briefly discussed in the next section and in Sect. VII B an extension of the halo-model is presented.

A. Hierarchical models

In cosmology one often starts with a scale-invariant correlation function $\xi_2(r) \propto r^{-\gamma}$ and assumes some closure relations for the $\xi_n$. Especially the hierarchical ansatz $\xi_n = Q_n \sum_{\mathrm{trees}} \prod^{n-1} \xi_2$ was extensively studied
(e.g. ||7|-|7|,|Q|[, and more recent 0,0). [§ discuss 
conditions for the coefficients $Q_n$ such that the expansion of the p.g.f.'s in terms of the count-in-cells converges. In this case the count-in-cells uniquely determine the point process. As illustrated in Sect. VIII with the log-normal distribution, a non-converging expansion does not necessarily imply that the stochastic model is not well-defined. It only implies that such a point process model is not completely specified by its correlation functions. For critical systems similar expansions in terms of correlation functions are typically divergent (see e.g.
@, chapt. 41). 

As another closure relation Kirkwood employed 
the following approximation 

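$$
\varrho_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3) = \varrho^3\, \bigl(1 + \xi_2(\mathbf{x}_1,\mathbf{x}_2)\bigr) \bigl(1 + \xi_2(\mathbf{x}_2,\mathbf{x}_3)\bigr) \bigl(1 + \xi_2(\mathbf{x}_3,\mathbf{x}_1)\bigr) \tag{51}
$$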

to calculate thermodynamic properties of fluids using the 
BBGKY hierarchy. This closure relation is exact for the 
log-normal distribution (e.g. [^ ). Empirically however 
one finds that this ansatz is disfavored as a model for the 
galaxy distribution [pl| . 



B. The generalized halo model 

In Sect. V several types of Poisson cluster processes were constructed by starting with Poisson distributed centers and attaching a secondary point process, the cluster, to each point. One can generalize this procedure by considering cluster centers given by already correlated points. One possibility is to iterate the construction principle of the simple Poisson cluster process, leading to the m-th order Neyman-Scott processes |^. If one is only interested in the first few correlation functions, the full specification of the point process is not necessary. Within the halo model (see e.g. [|l|,0|8|Jl|-|§ ) it is particularly easy to calculate the correlation functions. The difference to the Poisson cluster processes discussed previously is that the cluster centers now may be correlated themselves. The major physical assumption entering is that the properties of the clusters (halos) are independent of the positions and correlations of the cluster centers.

Consider a point process for the cluster centers, the parents, specified by the p.g.fl. $G_p[h]$. Independent of the distribution of the centers, a cluster with a p.g.fl. $G_c[h|\mathbf{y}]$ is attached to each center $\mathbf{y}$. Then the p.g.fl. of this cluster process is given by the "folding" of the two p.g.fl.'s:

$$
G[h] = G_p\bigl[ G_c[h|\cdot] \bigr]. \tag{52}
$$

Using the factorial cumulant expansion of the p.g.fl., these p.g.fl.'s are given by

$$
\log G_p[h+1] = \sum_{n=1}^{\infty} \frac{\varrho_p^n}{n!} \int_{\mathbb{R}^d} d\mathbf{y}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{y}_n\; \xi_n^{(p)}(\mathbf{y}_1,\ldots,\mathbf{y}_n)\, h(\mathbf{y}_1) \cdots h(\mathbf{y}_n), \tag{53}
$$

$$
\log G_c[h+1|\mathbf{y}] = \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{x}_n\; c_{[n]}(\mathbf{x}_1,\ldots,\mathbf{x}_n|\mathbf{y})\, h(\mathbf{x}_1) \cdots h(\mathbf{x}_n), \tag{54}
$$

where $\varrho_p$ is the number density and the $\xi_n^{(p)}$ are the correlation functions of the parent process. The $c_{[n]}(\ldots|\mathbf{y})$ are the factorial cumulants specifying the point distribution in a cluster, conditional on the cluster center $\mathbf{y}$. $c_{[1]}(\mathbf{x}|\mathbf{y})$ is the halo profile, with the mean number of points per halo $\mu = \int d\mathbf{x}\; c_{[1]}(\mathbf{x}|\mathbf{y})$. The $c_{[n]}$, $n \ge 2$, quantify the halo substructure. A halo without substructure is an inhomogeneous Poisson process, and completely characterized by $c_{[1]}(\mathbf{x}|\mathbf{y})$ and $c_{[n]} = 0$, $n \ge 2$. Combining Eqs. (52) and (53),

$$
\log G[h+1] = \sum_{m=1}^{\infty} \frac{\varrho_p^m}{m!} \int_{\mathbb{R}^d} d\mathbf{y}_1 \cdots \int_{\mathbb{R}^d} d\mathbf{y}_m\; \xi_m^{(p)}(\mathbf{y}_1,\ldots,\mathbf{y}_m) \prod_{i=1}^{m} \bigl( G_c[h+1|\mathbf{y}_i] - 1 \bigr), \tag{55}
$$

one immediately recovers the p.g.fl. of the Poisson cluster process, cf. Eq. (28), by setting $\xi_m^{(p)} = 0$ for $m \ge 2$ ($\xi_1^{(p)} = 1$).

C. $\xi_2$ and $\xi_3$ in the generalized halo model

In the standard halo model the clusters are simply modeled by an inhomogeneous Poisson process, whereas the centers are given by a correlated point process, typically determined from the evolved density distribution. Based on these assumptions one can calculate the correlation functions $\xi_n$ for the halo model |2^Jl^. Both theoretical models as well as observations suggest that dark matter caustics lead to substructure inside halos. Also recent high-resolution N-body simulations suggest that 15%-40% of the simulated halos show a significant amount of substructure (see js^ and references therein). To generalize the halo model, the correlations inside the halo are taken into account.

Consider the expansion of $G_c[h+1|\mathbf{y}_i] - 1$ in $h$:
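$$
\begin{aligned}
G_c[h+1|\mathbf{y}_i] - 1 ={}& \int_{\mathbb{R}^d} d\mathbf{x}_1\; h(\mathbf{x}_1)\, c_{[1]}(\mathbf{x}_1|\mathbf{y}_i) \\
&+ \frac{1}{2!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \int_{\mathbb{R}^d} d\mathbf{x}_2\; h(\mathbf{x}_1)h(\mathbf{x}_2) \\
&\qquad \times \bigl( c_{[1]}(\mathbf{x}_1|\mathbf{y}_i)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_i) + c_{[2]}(\mathbf{x}_1,\mathbf{x}_2|\mathbf{y}_i) \bigr) \\
&+ \frac{1}{3!} \int_{\mathbb{R}^d} d\mathbf{x}_1 \int_{\mathbb{R}^d} d\mathbf{x}_2 \int_{\mathbb{R}^d} d\mathbf{x}_3\; h(\mathbf{x}_1)h(\mathbf{x}_2)h(\mathbf{x}_3) \\
&\qquad \times \bigl( c_{[1]}(\mathbf{x}_1|\mathbf{y}_i)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_i)\, c_{[1]}(\mathbf{x}_3|\mathbf{y}_i) \\
&\qquad\quad + 3\, c_{[2]}(\mathbf{x}_1,\mathbf{x}_2|\mathbf{y}_i)\, c_{[1]}(\mathbf{x}_3|\mathbf{y}_i) + c_{[3]}(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3|\mathbf{y}_i) \bigr) \\
&+ O[h^4].
\end{aligned}
\tag{56}
$$

After inserting this expansion into Eq. (55) and collecting terms proportional to powers of $\varrho h(\cdot)$, with the mean number density $\varrho = \varrho_p \mu$, one can directly compare with the factorial cumulant expansion of the p.g.fl. and read off the correlation functions:

$$
\begin{aligned}
\xi_2(\mathbf{x}_1,\mathbf{x}_2) ={}& \frac{1}{\varrho_p \mu^2} \int_{\mathbb{R}^d} d\mathbf{y}_1\; \bigl( c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_1) + c_{[2]}(\mathbf{x}_1,\mathbf{x}_2|\mathbf{y}_1) \bigr) \\
&+ \frac{1}{\mu^2} \int_{\mathbb{R}^d} d\mathbf{y}_1 \int_{\mathbb{R}^d} d\mathbf{y}_2\; \xi_2^{(p)}(\mathbf{y}_1,\mathbf{y}_2)\, c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_2),
\end{aligned}
\tag{57}
$$

$$
\begin{aligned}
\xi_3(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3) ={}& \frac{1}{\varrho_p^2 \mu^3} \int_{\mathbb{R}^d} d\mathbf{y}_1\; \bigl( c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_3|\mathbf{y}_1) \\
&\qquad + 3\, c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[2]}(\mathbf{x}_2,\mathbf{x}_3|\mathbf{y}_1) + c_{[3]}(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3|\mathbf{y}_1) \bigr) \\
&+ \frac{3}{\varrho_p \mu^3} \int_{\mathbb{R}^d} d\mathbf{y}_1 \int_{\mathbb{R}^d} d\mathbf{y}_2\; \xi_2^{(p)}(\mathbf{y}_1,\mathbf{y}_2) \\
&\qquad \times \bigl( c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_2)\, c_{[1]}(\mathbf{x}_3|\mathbf{y}_2) + c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[2]}(\mathbf{x}_2,\mathbf{x}_3|\mathbf{y}_2) \bigr) \\
&+ \frac{1}{\mu^3} \int_{\mathbb{R}^d} d\mathbf{y}_1 \int_{\mathbb{R}^d} d\mathbf{y}_2 \int_{\mathbb{R}^d} d\mathbf{y}_3\; \xi_3^{(p)}(\mathbf{y}_1,\mathbf{y}_2,\mathbf{y}_3) \\
&\qquad \times c_{[1]}(\mathbf{x}_1|\mathbf{y}_1)\, c_{[1]}(\mathbf{x}_2|\mathbf{y}_2)\, c_{[1]}(\mathbf{x}_3|\mathbf{y}_3).
\end{aligned}
\tag{58}
$$

Similarly the higher n-point correlation functions can be calculated.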

In current calculations of the two- and three-point functions for the halo model the galaxies inside the halos are modeled as an inhomogeneous (finite) Poisson process. The halo profile $c_{[1]}(\mathbf{x}|\mathbf{y})$ is conditional on the cluster center $\mathbf{y}$, but no substructure inside halos is present, i.e. $c_{[n]} = 0$ for $n \ge 2$. In this case the above expressions simplify to the result of [ pT[ .

The simulation of such a point distribution can be carried out in a multi-step approach similar to the simulation of the Gauss-Poisson process (Appendix B 1). First generate the correlated cluster centers, e.g. by using a Gauss-Poisson process or a low-resolution simulation, and then attach a secondary point process modeled either as an inhomogeneous Poisson process or as an n-point Poisson cluster process.
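The two steps can be sketched in a few lines of Python. In this illustration the halo centers come from a Gauss-Poisson process (single points or pairs at a fixed separation) and every halo is an inhomogeneous Poisson process with an isotropic Gaussian profile, i.e. a halo without substructure. The function names, the periodic unit square, the fixed pair separation and the Gaussian profile are all assumptions chosen for brevity, not prescriptions from the text.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean)."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_halo_model(rho_parent, q2, pair_sep, mu, halo_sigma,
                        box=1.0, seed=0):
    """Two-step halo-model sketch in a periodic square.

    Step 1: halo centers from a Gauss-Poisson process; each Poisson parent
            gets a companion at distance pair_sep with probability q2.
    Step 2: each center receives a Poisson number (mean mu) of galaxies
            scattered with an isotropic Gaussian profile of width halo_sigma.
    Returns the list of centers and the list of galaxies.
    """
    rng = random.Random(seed)
    centers = []
    for _ in range(sample_poisson(rho_parent * box * box, rng)):
        x, y = rng.uniform(0.0, box), rng.uniform(0.0, box)
        centers.append((x, y))
        if rng.random() < q2:  # two-point cluster of centers
            phi = 2.0 * math.pi * rng.random()
            centers.append(((x + pair_sep * math.cos(phi)) % box,
                            (y + pair_sep * math.sin(phi)) % box))
    galaxies = []
    for cx, cy in centers:
        for _ in range(sample_poisson(mu, rng)):
            galaxies.append(((cx + rng.gauss(0.0, halo_sigma)) % box,
                             (cy + rng.gauss(0.0, halo_sigma)) % box))
    return centers, galaxies
```

Replacing the inner Poisson loop by a small n-point cluster generator would add substructure to the halos, which is the generalization considered in this section.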



D. Halo substructure 

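The following discussion shall serve mainly as an illustration of how to incorporate halo substructure in calculations of the correlation function. To keep things simple the following assumptions are made: the halo profile $c_{[1]}(\mathbf{x}|\mathbf{y}) = c_{[1]}(|\mathbf{x} - \mathbf{y}|)$ is independent of the mass of the halo, and $c_{[2]}(\mathbf{x}_1,\mathbf{x}_2|\mathbf{y})$ factors into $c_{[1]}(\mathbf{x}_1|\mathbf{y})\, c_{[1]}(\mathbf{x}_2|\mathbf{y})\, \gamma(|\mathbf{x}_1 - \mathbf{x}_2|)$, as expected for locally isotropic substructures. Let $P^{(p)}(k) = \int_{\mathbb{R}^d} d\mathbf{s}\; \xi_2^{(p)}(|\mathbf{s}|)\, e^{-i\mathbf{k}\cdot\mathbf{s}}$ be the power spectrum of the spatial distribution of the halo centers, and let $\hat{c}_{[1]}(k)$ and $\hat{\gamma}(k)$ be the Fourier transforms of $c_{[1]}$ and $\gamma$ respectively. The power spectrum of the galaxy distribution in the generalized halo model is then

$$
P(k) = \frac{|\hat{c}_{[1]}(k)|^2}{\varrho_p \mu^2} + \frac{|\hat{c}_{[1]}(k)|^2}{\mu^2}\, P^{(p)}(k) + \frac{1}{\varrho_p \mu^2} \int_{\mathbb{R}^d} \frac{d\mathbf{k}'}{(2\pi)^d}\; |\hat{c}_{[1]}(k')|^2\, \hat{\gamma}(|\mathbf{k} - \mathbf{k}'|). \tag{59}
$$

The first two terms are the result of |2^], the additional term accounts for halo substructure and involves a folding of $|\hat{c}_{[1]}|^2$ with $\hat{\gamma}$ in Fourier space. Similar expressions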
can be derived from Eq. (58) for the bispectrum. Quan- 
titative predictions for the galaxy distribution, similar 
to the investigations by will be the topic of future 
work. 



VIII. SOME OPEN PROBLEMS 

Our investigations rested on the assumption that the correlation functions exist and that the expansions of the p.g.fl. converge. In this case the p.g.fl., and consequently the point process, is determined completely by the correlation functions. The first assumption, the existence of the correlation functions (the factorial cumulants), does not impose dramatic restrictions on the models. In classical systems the mean number of points $E[N]$ as well as the factorial moments $E[N(N-1)\cdots(N-n+1)]$ should be finite in any bounded domain. For the n-point Poisson cluster processes, discussed in the preceding sections, at most $n$ points reside in a cluster, and the clusters are themselves distributed according to a Poisson process with constant number density. Clearly in such a simple situation both assumptions are satisfied. However, even for physically well motivated models, the convergence of the expansion of the p.g.fl. may not be guaranteed, although the point process itself and the correlation functions are well-defined.

Perhaps the best known example of a probability distribution which is not fully specified by its moments is the log-normal distribution. The probability density of a log-normal random variable is given by

$$
p(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left( -\frac{(\log x - \bar{x})^2}{2\sigma^2} \right) \tag{60}
$$

with parameters $\bar{x}$ and $\sigma^2$, the mean and variance of $\log x$. The moments (see Eq. (A1)) $m_k = \exp\bigl(k\bar{x} + \tfrac{1}{2} k^2 \sigma^2\bigr)$ are well-defined, however the expansion (A5) of the characteristic function is not convergent. Indeed, it has been shown that the probability density

$$
p'(x) = p(x) \left[ 1 + \epsilon \sin\left( \frac{2\pi k}{\sigma^2} (\log x - \bar{x}) \right) \right], \tag{61}
$$

where $0 < \epsilon < 1$ and $k$ is a positive integer, has moments identical to the moments of the log-normal distribution. A comparison of $p(x)$ and $p'(x)$ is shown in Fig. 4.




FIG. 4. The probability density of the log-normal distribution (solid line, $\bar{x} = 0$ and $\sigma = 0.8$) and a probability density from the family of distributions with the same moments (dashed line, $\epsilon = 0.5$ and $k = 1$).
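The equality of the moments of $p(x)$ and $p'(x)$ can be checked numerically. Substituting $\log x = \bar{x} + \sigma y$ turns each moment into an integral over a standard normal density in $y$, in which the perturbation appears as the factor $1 + \epsilon \sin(2\pi k y/\sigma)$. The Python sketch below evaluates this integral with a plain Riemann sum; the function name, the grid parameters and the default values (taken from the figure caption) are assumptions made for this illustration.

```python
import math


def moment(n, xbar=0.0, sigma=0.8, eps=0.0, k=1, steps=40001, ymax=16.0):
    """n-th moment of p'(x) = p(x) [1 + eps sin(2 pi k (log x - xbar) / sigma^2)].

    With log x = xbar + sigma * y the integral runs over a standard normal
    density in y; it is approximated by a Riemann sum on [-ymax, ymax].
    eps = 0 gives the log-normal distribution itself.
    """
    dy = 2.0 * ymax / (steps - 1)
    total = 0.0
    for i in range(steps):
        y = -ymax + i * dy
        gauss = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        perturb = 1.0 + eps * math.sin(2.0 * math.pi * k * y / sigma)
        total += math.exp(n * (xbar + sigma * y)) * gauss * perturb
    return total * dy
```

Comparing `moment(n, eps=0.5)` with `moment(n)` for the first few integers `n` gives agreement to within the accuracy of the quadrature, although the two densities differ visibly.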

A log-normal random field (an "exponentiated" Gaus- 
sian random field) is positive at any point in space, and 
a point process can be constructed using the value of 
the field as the local number density. The multivari- 
ate log-normal distribution, and the log-normal random 
field inherit the behaviour of the moments of the simple 
log-normal distribution. The point distribution obtained 
from the "Poisson sampled" log-normal random field is 
not characterized completely by its correlation functions 
as already discussed by ||8^. See also |Q for a similar 
approach towards this "log-Gaussian Cox process" . 



In a Poisson cluster process (and also in the halo- 
model) the point distribution inside the cluster is spec- 
ified independently from the distribution of the centers. 
This constructive approach, and the truncation of the moment expansion, guarantee the existence of these processes. A characterization result for the generalized Hermite distribution, closely related to the general n-point process considered in Sect. V D, is discussed by [60].



Also well-defined point processes which do not impose 
such a truncation of the moment expansion are possi- 
ble. A simple model is the line-segment process used 
in Sect. IVA, where the number of points per cluster 
is a Poisson random variable. Attempts towards a general characterization of point processes were conducted by [87-89], and partially succeeded for the case of infinitely divisible point processes.

One can show that any regular infinitely divisible point process is a Poisson cluster process (regular means that a cluster with an infinite number of points has probability zero). An infinitely divisible point process may be constructed as a superposition of any number of independent point processes. It is interesting to note that the log-normal distribution is infinitely divisible [90], although the expansion of the characteristic function in terms of moments (A5) does not converge.

On small scales the galaxy correlation function is scale
invariant: $\xi_2 \propto r^{-\gamma}$. If a cut-off at some large scales is
present, and the constraints for the density and the cor- 
relation functions are satisfied, a model based on a Pois- 
son cluster process becomes feasible. Unfortunately, the 
superposition of independent point processes, as implied 
by the infinite divisibility of a Poisson cluster process, 
does not seem to be a good model assumption for the 
interconnected network of correlated walls and filaments, 
as observed in the galaxy distribution (e.g. |^). The 
correlation functions for the galaxy distribution are close 
to zero for large separations, but from current observa- 
tions one can not infer a definite cut-off. As discussed 



in Sect. III C for the Gauss-Poisson process, the large-scale behaviour of the correlation functions plays an important role in the construction of the Poisson cluster processes. Moreover, the dynamical equations governing the evolution of large-scale structures are non-local (see [92] and references therein). Therefore it seems worthwhile to consider also point process models which are
not infinitely divisible. Unfortunately, beyond infinitely 
divisible point processes it is not clear what kind of prop- 
erties the correlation functions have and especially what 
kind of additional constraints arise. 



IX. SUMMARY AND CONCLUSION 

The Gaussian random field, fully specified by the mean and its correlation function, is one of the reference models employed in cosmology. Typical inflationary scenarios suggest that the primordial mass-density field is a realization of a Gaussian random field. Non-Gaussian
features in the present day distribution of mass may be 
attributed either to the non-linear process of structure 
formation, or to a non-Gaussian primordial density field. 
Observations of the large-scale distribution of galaxies 
however provide us with a distribution of points in space. 
The process of galaxy formation may introduce further 
non-Gaussian features in the galaxy point distribution. 
In this paper a direct approach towards the characterization of this point set was pursued. The statistical properties of the point distribution can be specified by the sequence of correlation functions $\xi_n$. In close analogy to the Gaussian random field, a Gaussian point distribution, the Gauss-Poisson point process, was constructed. This random point set is fully specified by its mean number density $\varrho$, the two-point correlation function $\xi_2(r)$, and $\xi_n = 0$ for $n > 2$. Important constraints on $\varrho$ and $\xi_2(r)$, not present for the Gaussian random field, show up. Namely, $\xi_2(r) \ge 0$ for all $r$, and the variance of the number of points must not exceed twice the value of a Poisson process. The violation of these constraints indicates non-Gaussian features in the galaxy distribution. The equivalence of the Gauss-Poisson point process with a Poisson cluster point process leads to a simple simulation algorithm for such a point distribution. Using the J-function, higher-order correlations were detected in both a two-dimensional example and the galaxy distribution. The comparison with the Gauss-Poisson point process allows us to quantify the level of significance of these non-Gaussian features. Using these methods ||9^ ] could show that the distribution of galaxy clusters may not be modeled by a Gauss-Poisson process at a significance level of 95%.

The formal approach based on the probability generat- 
ing functional (p.g.fl.) facilitated the definition, the char- 
acterization, and the simulation of the Gauss-Poisson 
point process. The inclusion of higher-order correlation 
functions was straightforward, leading to the n-point 
Poisson cluster processes. Both the definition and the 
simulation algorithm were detailed for the three-point 
Poisson cluster process. The Gauss-Poisson point process and the two-point Poisson cluster process are equivalent. However, this is no longer true for the n-point case with n > 2. The set of general n-point processes, resulting from a truncation of the cumulant expansion of the p.g.fl. after the n-th order, contains all n-point Poisson cluster processes as a proper subset. This was discussed for the three-point case explicitly. Although models based on the n-point Poisson cluster process are not the most general ones, they cover a broad range of clustering point
distributions. A Poisson cluster process can be simulated 
easily and is especially helpful for comparing statistical 
methods and estimators. 

The inclusion of more and more points in the ran- 
domly placed clusters is only one way to extend the 
Gauss-Poisson point process into the strongly-correlated 
regime. In the halo model one allows for correlations be- 
tween the cluster centers. Typically the halo (i.e. the 



galaxy cluster) is modeled without substructure. Again 
using the p.g.fl., the influence of correlations inside a halo 
on the n-point correlation functions of the resulting point 
distribution could be calculated. 

All the models discussed above offer some insight into 
certain aspects of the clustering of the galaxy distribu- 
tion. As argued in the preceding section, point process 
models which are not decomposable into independent 
point processes seem more appropriate. Unfortunately, 
even basic mathematical questions concerning the (complete) characterization of these models in terms of moments and beyond are still open.

APPENDIX A: CHARACTERISTIC AND
GENERATING FUNCTIONS OF RANDOM 
VARIABLES 

A short review dealing with the characteristic and 
probability generating function (p.g.f.) of a random 
variable and their expansions in terms of moments, cu- 
mulants, factorial moments, and factorial cumulants is 
given. This Appendix is meant to serve as an illustra- 
tion highlighting the analogies between expansions of the 
probability generating functional (p.g.fl.) and the prob- 
ability generating function (see also @). To keep this 
summary simple it is assumed that the moments etc. ex- 
ist and the expansions converge. For a more thorough 
treatment of characteristic and generating functions see 
e.g. 

The moments of a random variable $x$ with probability distribution $F$ are defined by

$$
m_k = E[x^k] = \int dF(x)\, x^k. \tag{A1}
$$

If $n$ is a discrete random variable, especially if $n$ is integer-valued and greater than or equal to zero, it is often more convenient to work with the factorial moments:

$$
m_{[k]} = E[n^{[k]}] = \int_0^{\infty} dF(n)\, n^{[k]} = \sum_{n=0}^{\infty} p_n\, n^{[k]}, \tag{A2}
$$

where $n^{[k]} = n(n-1)\cdots(n-k+1)$, and $p_n$ is the probability that the random variable takes the value $n$. Similarly, for point processes it is more convenient to work with product densities (or factorial moment measures), instead of moment measures.

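The characteristic function $\varphi(s)$, $s \in \mathbb{R}$, of a distribution $F$ is defined as

$$
\varphi(s) = \int_{-\infty}^{\infty} dF(x)\, e^{isx} \tag{A3}
$$

and serves as a generating function for the moments. Expanding the exponential one can easily verify that

$$
m_k = (-i)^k \left. \frac{d^k \varphi(s)}{ds^k} \right|_{s=0}. \tag{A4}
$$

By inverting one obtains the expansion of $\varphi(s)$ in terms of moments:

$$
\varphi(s) = \sum_{k=0}^{\infty} \frac{(is)^k}{k!}\, m_k. \tag{A5}
$$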



The probability generating function (p.g.f.) $P(z)$ of a 
random variable n is defined as 

$$P(z) = \mathbb{E}[z^n] . \tag{A6}$$



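Note that $P(e^{it}) = \varphi(t)$. For a nonnegative integer-valued 
random variable one obtains the expansions 

$$P(z) = \sum_{n=0}^{\infty} p_n z^n , \tag{A7}$$
$$P(1+t) = 1 + \sum_{k=1}^{\infty} \frac{t^k}{k!}\, m_{[k]} \tag{A8}$$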



in terms of the probabilities $p_n$. $P(1+t)$ serves as the 
generating function for the factorial moments $m_{[k]}$. Similarly, 
the product densities (factorial moment measures) for a point 
process can be derived as functional derivatives (Fréchet 
derivatives) of the probability generating functional. 

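Using $P(e^{it}) = \varphi(t)$ one can derive the relation between 
moments and factorial moments: 

$$\sum_{k=0}^{\infty} \frac{(is)^k}{k!}\, m_k = 1 + \sum_{t=0}^{\infty} \sum_{k=1}^{\infty} \frac{(is)^t}{t!}\, m_{[k]}\, S_2(t,k), \tag{A9}$$

where 

$$S_2(t,k) = \sum_{l=1}^{k} (-1)^{k-l}\, \frac{l^t}{(k-l)!\; l!} \tag{A10}$$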



are the Stirling numbers of the second kind, the number 
of partitions which split the set $\{1, \ldots, t\}$ into k pairwise 
disjoint nonempty sets (see e.g. |Q). Since $S_2(t,k) = 0$ 
for $k > t$, which is also respected by the expression (A10), 
one finally arrives at 

$$m_t = \sum_{k=1}^{t} S_2(t,k)\, m_{[k]} . \tag{A11}$$
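
The relation (A11) can be checked exactly for a small discrete distribution. The sketch below evaluates the Stirling numbers from the explicit sum (A10) for t ≥ 1 with rational arithmetic and compares both sides of (A11); the binomial test distribution and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import factorial

def stirling2(t, k):
    """S_2(t, k) from the explicit sum (A10); intended for t >= 1."""
    total = sum(
        Fraction((-1) ** (k - l) * l ** t, factorial(k - l) * factorial(l))
        for l in range(1, k + 1)
    )
    assert total.denominator == 1  # the sum is an integer for t >= 1
    return int(total)

def falling_factorial(n, k):
    """n^[k] = n (n-1) ... (n-k+1)."""
    result = 1
    for j in range(k):
        result *= n - j
    return result

def moment(pmf, t):
    """m_t = sum_n p_n n^t."""
    return sum(p * n ** t for n, p in enumerate(pmf))

def factorial_moment(pmf, k):
    """m_[k] = sum_n p_n n^[k]."""
    return sum(p * falling_factorial(n, k) for n, p in enumerate(pmf))

def moment_via_a11(pmf, t):
    """Right-hand side of (A11): sum_{k=1}^{t} S_2(t,k) m_[k]."""
    return sum(stirling2(t, k) * factorial_moment(pmf, k) for k in range(1, t + 1))

# Binomial(3, 1/2) probabilities p_0 .. p_3 as exact fractions (illustrative choice).
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
for t in range(1, 6):
    print(t, moment(pmf, t), moment_via_a11(pmf, t))
```

Because all quantities are rational numbers, both columns agree exactly and not only up to rounding.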



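One considers not only the moments $m_k$ of a random 
variable, but also the cumulants $c_k$ defined by 

$$\exp\left( \sum_{k=1}^{\infty} \frac{(is)^k}{k!}\, c_k \right) = 1 + \sum_{k=1}^{\infty} \frac{(is)^k}{k!}\, m_k = \varphi(s) . \tag{A12}$$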



Clearly, $\log \varphi(s)$ serves as a generating function for the 
cumulants $c_k$. Perhaps the best known cumulant is the 
variance 

$$c_2 = \sigma^2 = m_2 - m_1^2 . \tag{A13}$$



Similarly, for a nonnegative integer-valued random variable 
the factorial cumulants $c_{[k]}$ are defined by 

$$\exp\left( \sum_{k=1}^{\infty} \frac{t^k}{k!}\, c_{[k]} \right) = 1 + \sum_{k=1}^{\infty} \frac{t^k}{k!}\, m_{[k]} = P(1+t) . \tag{A14}$$



Hence, $\log P(1+t)$ serves as the generating function of 
the factorial cumulants $c_{[k]}$. The correlation functions $\xi_n$ 
used in cosmology are the densities of the normalized 
factorial cumulant measures of a point process, corresponding 
to the factorial cumulants $c_{[k]}$ of a discrete random 
variable. Setting $t = e^{is} - 1$ in Eq. (A14) and comparing 
term by term with Eq. (A12) one obtains the same 
relation between cumulants and factorial cumulants as 
between moments and factorial moments: 

$$c_k = \sum_{l=1}^{k} S_2(k,l)\, c_{[l]} . \tag{A15}$$
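
A count variable that mimics the Gauss-Poisson construction gives a compact check of (A15). Take N = N1 + 2 N2 with independent Poisson variables N1 and N2 of means a and b (an example assumed here, not taken from the text). Then log P(1+t) = (a + 2b) t + b t^2, so only c_[1] = a + 2b and c_[2] = 2b are nonzero, while the ordinary cumulants are c_k = a + 2^k b. The sketch below, with hypothetical function names, recovers the latter from the former.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(t, k):
    """Stirling numbers of the second kind via the standard recurrence."""
    if t == k:
        return 1
    if k == 0 or k > t:
        return 0
    return k * stirling2(t - 1, k) + stirling2(t - 1, k - 1)

def cumulants_from_factorial(fact_cumulants, k_max):
    """Apply (A15): c_k = sum_{l=1}^{k} S_2(k,l) c_[l].

    fact_cumulants[l-1] holds c_[l]; entries beyond the list are taken as zero.
    """
    def c_fact(l):
        return fact_cumulants[l - 1] if l <= len(fact_cumulants) else 0
    return [
        sum(stirling2(k, l) * c_fact(l) for l in range(1, k + 1))
        for k in range(1, k_max + 1)
    ]

a, b = Fraction(3, 2), Fraction(1, 4)          # illustrative Poisson means
cumulants = cumulants_from_factorial([a + 2 * b, 2 * b], 5)
expected = [a + 2 ** k * b for k in range(1, 6)]
print(cumulants == expected)
```

Only two factorial cumulants are nonzero here, yet every ordinary cumulant is nonzero, which illustrates why the factorial quantities are the natural ones for such cluster models.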



APPENDIX B: SIMULATION ALGORITHMS 



1. The Gauss-Poisson point process 



As discussed in Sect. II B, every Gauss-Poisson process 
is a Poisson cluster process and can therefore be 
simulated easily. For a given number density $\varrho$ and a 
two-point correlation function $\xi_2(r)$ fulfilling the constraints 
(12) and (13), realizations of the Gauss-Poisson 
process can be generated straightforwardly. With the 
normalization $C_2 = \int_{\mathbb{R}^d} \mathrm{d}x\, \xi_2(|x|)$ and $\int_{\mathbb{R}^d} \mathrm{d}x\, f(|x|) = 1$ 
one calculates the quantities needed for the simulation: 
$f(r) = \xi_2(r)/C_2$, $q_2 = \varrho C_2/(2 - \varrho C_2)$, $q_1 = 1 - q_2$, and 
$\varrho_p = \varrho\,(1 - \varrho C_2/2)$. The constraint (16) can now be written as 
$0 \le \varrho C_2 \le 1$. The simulation is carried out in two steps: 

• First generate the cluster centers according to a 
Poisson distribution with number density $\varrho_p$. 
• For each cluster center x draw a uniform random 
number z in [0, 1]. If $z < q_1$, then keep only the 
point x. If $z > q_1$, then keep the point x and additionally 
choose a random direction on the unit sphere 
and a distance d with the probability density f, and 
place the second point according to them. 

To get the correct point pattern inside a given window, 
one also has to use cluster centers outside the window 
to ensure that any possible secondary point inside the 
window is included. 
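
The two steps and the remark on the window can be sketched as follows for three dimensions. The sketch assumes a Gaussian profile f(|x|) of width `sigma`, so that the displacement of the secondary point is an isotropic normal vector (equivalent to a uniform direction with a radial distance distributed proportional to r^2 f(r)), a cubic window of side `box`, and a buffer of a few `sigma` around it. These choices and all function names are illustrative assumptions; only the formulas for $q_1$, $q_2$ and $\varrho_p$ follow the text.

```python
import random

def poisson_draw(mean, rng):
    """Poisson variate obtained by counting unit-rate exponential arrivals below `mean`."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count

def gauss_poisson_params(rho, c2):
    """Return (rho_p, q1, q2) for number density rho and C2 = integral of xi_2."""
    x = rho * c2
    if not 0.0 <= x <= 1.0:
        raise ValueError("constraint 0 <= rho * C2 <= 1 violated")
    q2 = x / (2.0 - x)
    return rho * (1.0 - x / 2.0), 1.0 - q2, q2

def simulate_gauss_poisson(rho, c2, sigma, box, seed=0, buffer_sigmas=6.0):
    """Points of a Gauss-Poisson process inside the cubic window [0, box)^3."""
    rng = random.Random(seed)
    rho_p, q1, _ = gauss_poisson_params(rho, c2)
    pad = buffer_sigmas * sigma            # centers outside the window are needed too
    lo, hi = -pad, box + pad
    n_centers = poisson_draw(rho_p * (hi - lo) ** 3, rng)
    points = []
    for _ in range(n_centers):
        center = tuple(rng.uniform(lo, hi) for _ in range(3))
        cluster = [center]
        if rng.random() >= q1:             # two-point cluster with probability q2
            cluster.append(tuple(c + rng.gauss(0.0, sigma) for c in center))
        # Keep only the points that fall inside the window.
        points.extend(p for p in cluster if all(0.0 <= c < box for c in p))
    return points

points = simulate_gauss_poisson(rho=0.5, c2=1.0, sigma=0.3, box=10.0, seed=1)
print(len(points), "points; expected about", 0.5 * 10.0 ** 3)
```

For another profile f one would replace the Gaussian displacement by a sampler for that density, for instance by inverse-transform or rejection sampling of the radial distance.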



2. The three-point Poisson cluster process 

In the following the algorithm for the simulation of the 
three-point cluster process is given. The expressions (^) 
together with the normalization conditions for $f_2$ and $f_3$ 
serve as a starting point. An algorithm similar to the one 
for the Gauss-Poisson process described in Appendix B 1 
can be constructed: 

Given the number density $\varrho$ and the two- and three-point 
correlation functions, $\xi_2$ and $\xi_3$, one defines 

$$C_2 = \int_{\mathbb{R}^d} \mathrm{d}x\, \xi_2(|x|) \quad \text{and} \quad C_3 = \int_{\mathbb{R}^d} \mathrm{d}x \int_{\mathbb{R}^d} \mathrm{d}y\, \xi_3(0, x, y) . \tag{B1}$$

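Using the normalization of $f_2$ and $f_3$ one obtains 

$$q_2 = \frac{\varrho^2\, (C_2 - C_3 \varrho)}{2 \varrho_p}, \quad q_3 = \frac{\varrho^3 C_3}{6 \varrho_p}, \quad \varrho_p = \frac{\varrho}{6} \left( 6 - 3 C_2 \varrho + C_3 \varrho^2 \right), \quad q_1 = 1 - q_2 - q_3, \tag{B2}$$

resulting in 

$$f_2(x_1, x_2) = \frac{1}{C_2 - C_3 \varrho} \left( \xi_2(x_1, x_2) - \varrho \int \mathrm{d}x_3\, \xi_3(x_1, x_2, x_3) \right),$$
$$f_2^{(3)}(x_1, x_2) = \frac{1}{C_3} \int \mathrm{d}x_3\, \xi_3(x_1, x_2, x_3), \tag{B3}$$
$$f_3(x_1, x_2, x_3) = \frac{1}{C_3}\, \xi_3(x_1, x_2, x_3) .$$

Since $\varrho_p$ and the $q_n$ are positive numbers, the constraints 
$C_3 \varrho^2 < C_2 \varrho < 2 + C_3 \varrho^2/3$ must be satisfied and the 
relation $\xi_2 = C_2 f_2 + C_3 \varrho\, (f_2^{(3)} - f_2)$ holds. The algorithm 
now reads: 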

• First generate the cluster centers according to a 
Poisson distribution with number density $\varrho_p$. 

• For each cluster center x draw a uniform random 
number z in [0, 1]. If $z < q_1$, then keep only the 
point x. If $q_1 < z < q_1 + q_2$, then keep the point 
x and additionally choose a random point according 
to the probability density $f_2$. If $q_1 + q_2 < z$, then 
keep the point x, choose a second point according 
to the probability density $f_2^{(3)}$, and a third point 
according to $f_3$. 
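
The bookkeeping part of this algorithm can be sketched without fixing the densities $f_2$, $f_2^{(3)}$ and $f_3$. The code below computes $\varrho_p$ and the probabilities $q_1$, $q_2$, $q_3$ from $\varrho$, $C_2$ and $C_3$, using the expressions that follow from $\varrho = \varrho_p (q_1 + 2 q_2 + 3 q_3)$ together with the normalization of $f_2$ and $f_3$, and then draws the number of points per cluster. The function names and numerical values are assumptions for illustration, and the explicit check that all probabilities are nonnegative is an addition of this sketch.

```python
import random

def three_point_params(rho, c2, c3):
    """Return (rho_p, q1, q2, q3) for the three-point Poisson cluster process."""
    denom = 6.0 - 3.0 * c2 * rho + c3 * rho ** 2
    if denom <= 0.0:
        raise ValueError("rho_p would not be positive")
    rho_p = rho * denom / 6.0
    q3 = c3 * rho ** 2 / denom               # equals rho^3 C3 / (6 rho_p)
    q2 = 3.0 * rho * (c2 - c3 * rho) / denom  # equals rho^2 (C2 - C3 rho) / (2 rho_p)
    q1 = 1.0 - q2 - q3
    if min(q1, q2, q3) < 0.0:
        raise ValueError("probabilities q_n must be nonnegative")
    return rho_p, q1, q2, q3

def draw_cluster_size(q1, q2, rng):
    """Number of points in one cluster: 1, 2 or 3 with probabilities q1, q2, q3."""
    z = rng.random()
    if z < q1:
        return 1
    if z < q1 + q2:
        return 2
    return 3

rho_p, q1, q2, q3 = three_point_params(rho=1.0, c2=0.6, c3=0.3)
rng = random.Random(0)
sizes = [draw_cluster_size(q1, q2, rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)
# rho_p times the mean cluster size should reproduce the number density rho.
print(rho_p, q1, q2, q3, rho_p * mean_size)
```

Setting `c3 = 0` reduces the parameters to those of the Gauss-Poisson process, which is a convenient consistency check.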



APPENDIX C: CUMULANTS AND FACTORIAL 
CUMULANTS 

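Consider the expansion (0) of the p.g.fl. in terms of the 
factorial cumulants $c_{[k]}(x_1, \ldots, x_k)$: 

$$\log G[h] = \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k)\, (h_1 - 1) \cdots (h_k - 1)$$
$$= \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k) \left\{ (-1)^k + \sum_{n=1}^{k} (-1)^{k-n} \sum_{I \in J_n^k} h_{I_1} \cdots h_{I_n} \right\} \tag{C1}$$

with $h_i = h(x_i)$, and $J_n^k$ is formed by the ordered subsets 
of $\{1, \ldots, k\}$ with $n \le k$ distinct entries. Hence, a subset 
$I \in J_n^k$ consists of n distinct numbers $\{I_1, \ldots, I_n\}$ 
with $I_1 < \ldots < I_n$, e.g. $J_2^3 = \{\{1,2\}, \{1,3\}, \{2,3\}\}$. Using 
$\Phi[h] = G[e^{ih}]$ one obtains 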

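$$\log \Phi[h] = \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k) \times \left\{ (-1)^k + \sum_{n=1}^{k} (-1)^{k-n} \sum_{I \in J_n^k} \exp\Big( i \sum_{j=1}^{n} h_{I_j} \Big) \right\}$$
$$= \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k) \times \left\{ (-1)^k + \sum_{m=0}^{\infty} \frac{i^m}{m!} \sum_{n=1}^{k} (-1)^{k-n} \sum_{I \in J_n^k} \Big( \sum_{j=1}^{n} h_{I_j} \Big)^m \right\} . \tag{C2}$$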
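In the m sum the first term with m = 0 equals $-(-1)^k$, 
canceling with the $(-1)^k$ inside the braces: 

$$\log \Phi[h] = \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k) \times \sum_{m=1}^{\infty} \frac{i^m}{m!} \sum_{n=1}^{k} (-1)^{k-n} \sum_{I \in J_n^k} \Big( \sum_{j=1}^{n} h_{I_j} \Big)^m$$
$$= \sum_{m=1}^{\infty} \frac{i^m}{m!} \left\{ \sum_{k=1}^{\infty} \frac{1}{k!} \int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_k\; c_{[k]}(x_1, \ldots, x_k) \sum_{n=1}^{k} (-1)^{k-n} \sum_{I \in J_n^k} \Big( \sum_{j=1}^{n} h_{I_j} \Big)^m \right\} . \tag{C3}$$

The expression inside the braces in (C3) equals 

$$\int_{\mathbb{R}^d} \mathrm{d}x_1 \cdots \int_{\mathbb{R}^d} \mathrm{d}x_m\; c_m(x_1, \ldots, x_m)\, h_1 \cdots h_m . \tag{C4}$$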



This fixes the relation between the cumulants and factorial 
cumulants. However, there is no straightforward 
way to simplify this expression. Above all, the theorem 
of Marcinkiewicz demands that as soon as k > 2, the 
k-sum always has to be an infinite sum (see Sect. VI A). 
Hence, the cumulants $c_k(\cdot)$ depend on an infinite alternating 
sum of the factorial cumulants $c_{[k]}(\cdot)$, and vice 
versa. 
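
The cancellation of the m = 0 term used in this derivation rests on a purely combinatorial identity: each subset in $J_n^k$ contributes 1 at m = 0, so the term equals the alternating sum of the sizes of the $J_n^k$ over n, which is $-(-1)^k$. The following sketch enumerates the sets $J_n^k$ explicitly and checks this; the function names are hypothetical.

```python
from itertools import combinations

def ordered_subsets(k, n):
    """J_n^k: subsets of {1, ..., k} with n distinct entries, each in increasing order."""
    return list(combinations(range(1, k + 1), n))

def m0_term(k):
    """Sum over n = 1..k of (-1)^(k-n) times the number of subsets in J_n^k."""
    return sum((-1) ** (k - n) * len(ordered_subsets(k, n)) for n in range(1, k + 1))

print(ordered_subsets(3, 2))  # the example J_2^3 from the text
for k in range(1, 8):
    # The two numbers in each row should coincide.
    print(k, m0_term(k), -((-1) ** k))
```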